Stata: Drop a Random Subset of Observations (DO file; for more, see Fan and Stata4Econ)
-----------------------------------------------------------------------------------------------------------------------------
name: drop_random_subset
log: C:\Users\fan/Stata4Econ//rand/basic/fs_droprand.smcl
log type: smcl
opened on: 6 Oct 2019, 18:50:13
. log on $st_logname
(log already on)
.
. ///-- Site Link: Fan's Project Reusable Stata Codes Table of Contents
> di "https://fanwangecon.github.io/"
https://fanwangecon.github.io/
. di "https://fanwangecon.github.io/Stata4Econ/"
https://fanwangecon.github.io/Stata4Econ/
.
. ///-- File Title
> global filetitle "Stata Drop a Random Subset of Observations"
.
. ///--- Load Data
> set more off
. sysuse auto, clear
(1978 Automobile Data)
.
. ///--- Generating Index for Dropping
> set seed 987
. scalar it_drop_frac = 3
. gen row_idx_it = round((_n/_N)*it_drop_frac)
. gen row_idx_rand = round(it_drop_frac*uniform())
.
. //--- values are set to missing where row_idx_it == row_idx_rand (shown here with it_drop_frac = 3)
. list make price mpg row_idx_it row_idx_rand, ab(20)
+--------------------------------------------------------------+
| make price mpg row_idx_it row_idx_rand |
|--------------------------------------------------------------|
1. | AMC Concord 4,099 22 0 0 |
2. | AMC Pacer 4,749 17 0 2 |
3. | AMC Spirit 3,799 22 0 1 |
4. | Buick Century 4,816 20 0 3 |
5. | Buick Electra 7,827 15 0 1 |
|--------------------------------------------------------------|
6. | Buick LeSabre 5,788 18 0 1 |
7. | Buick Opel 4,453 26 0 2 |
8. | Buick Regal 5,189 20 0 2 |
9. | Buick Riviera 10,372 16 0 1 |
10. | Buick Skylark 4,082 19 0 2 |
|--------------------------------------------------------------|
11. | Cad. Deville 11,385 14 0 1 |
12. | Cad. Eldorado 14,500 14 0 2 |
13. | Cad. Seville 15,906 21 1 1 |
14. | Chev. Chevette 3,299 29 1 2 |
15. | Chev. Impala 5,705 16 1 1 |
|--------------------------------------------------------------|
16. | Chev. Malibu 4,504 22 1 2 |
17. | Chev. Monte Carlo 5,104 22 1 2 |
18. | Chev. Monza 3,667 24 1 2 |
19. | Chev. Nova 3,955 19 1 3 |
20. | Dodge Colt 3,984 30 1 1 |
|--------------------------------------------------------------|
21. | Dodge Diplomat 4,010 18 1 2 |
22. | Dodge Magnum 5,886 16 1 2 |
23. | Dodge St. Regis 6,342 17 1 1 |
24. | Ford Fiesta 4,389 28 1 0 |
25. | Ford Mustang 4,187 21 1 1 |
|--------------------------------------------------------------|
26. | Linc. Continental 11,497 12 1 3 |
27. | Linc. Mark V 13,594 12 1 2 |
28. | Linc. Versailles 13,466 14 1 2 |
29. | Merc. Bobcat 3,829 22 1 3 |
30. | Merc. Cougar 5,379 14 1 2 |
|--------------------------------------------------------------|
31. | Merc. Marquis 6,165 15 1 1 |
32. | Merc. Monarch 4,516 18 1 3 |
33. | Merc. XR-7 6,303 14 1 2 |
34. | Merc. Zephyr 3,291 20 1 1 |
35. | Olds 98 8,814 21 1 1 |
|--------------------------------------------------------------|
36. | Olds Cutl Supr 5,172 19 1 1 |
37. | Olds Cutlass 4,733 19 2 1 |
38. | Olds Delta 88 4,890 18 2 1 |
39. | Olds Omega 4,181 19 2 1 |
40. | Olds Starfire 4,195 24 2 1 |
|--------------------------------------------------------------|
41. | Olds Toronado 10,371 16 2 3 |
42. | Plym. Arrow 4,647 28 2 1 |
43. | Plym. Champ 4,425 34 2 0 |
44. | Plym. Horizon 4,482 25 2 1 |
45. | Plym. Sapporo 6,486 26 2 2 |
|--------------------------------------------------------------|
46. | Plym. Volare 4,060 18 2 1 |
47. | Pont. Catalina 5,798 18 2 2 |
48. | Pont. Firebird 4,934 18 2 1 |
49. | Pont. Grand Prix 5,222 19 2 1 |
50. | Pont. Le Mans 4,723 19 2 2 |
|--------------------------------------------------------------|
51. | Pont. Phoenix 4,424 19 2 1 |
52. | Pont. Sunbird 4,172 24 2 2 |
53. | Audi 5000 9,690 17 2 3 |
54. | Audi Fox 6,295 23 2 2 |
55. | BMW 320i 9,735 25 2 2 |
|--------------------------------------------------------------|
56. | Datsun 200 6,229 23 2 1 |
57. | Datsun 210 4,589 35 2 3 |
58. | Datsun 510 5,079 24 2 2 |
59. | Datsun 810 8,129 21 2 3 |
60. | Fiat Strada 4,296 21 2 0 |
|--------------------------------------------------------------|
61. | Honda Accord 5,799 25 2 3 |
62. | Honda Civic 4,499 28 3 2 |
63. | Mazda GLC 3,995 30 3 1 |
64. | Peugeot 604 12,990 14 3 2 |
65. | Renault Le Car 3,895 26 3 3 |
|--------------------------------------------------------------|
66. | Subaru 3,798 35 3 1 |
67. | Toyota Celica 5,899 18 3 3 |
68. | Toyota Corolla 3,748 31 3 1 |
69. | Toyota Corona 5,719 18 3 0 |
70. | VW Dasher 7,140 23 3 1 |
|--------------------------------------------------------------|
71. | VW Diesel 5,397 41 3 0 |
72. | VW Rabbit 4,697 25 3 3 |
73. | VW Scirocco 6,850 25 3 3 |
74. | Volvo 260 11,995 17 3 2 |
+--------------------------------------------------------------+
.
. ///--- Drop approximately 1/2 of make randomly
> set seed 987
. scalar it_drop_frac = 2
. clonevar make_wth_mimssing = make
. replace make_wth_mimssing = "" if round((_n/_N)*it_drop_frac) == round(it_drop_frac*uniform())
(33 real changes made)
.
. ///--- Drop approximately 1/3 of mpg randomly
> set seed 987
. scalar it_drop_frac = 3
. clonevar mpg_wth_mimssing = mpg
. replace mpg_wth_mimssing =. if round((_n/_N)*it_drop_frac) == round(it_drop_frac*uniform())
(21 real changes made, 21 to missing)
.
. ///--- Drop approximately 1/5 of price randomly
> set seed 987
. scalar it_drop_frac = 5
. clonevar price_wth_mimssing = price
. replace price_wth_mimssing =. if round((_n/_N)*it_drop_frac) == round(it_drop_frac*uniform())
(16 real changes made, 16 to missing)
.
. ///--- Summarize
> codebook make*
-----------------------------------------------------------------------------------------------------------------------------
make Make and Model
-----------------------------------------------------------------------------------------------------------------------------
type: string (str18), but longest is str17
unique values: 74 missing "": 0/74
examples: "Cad. Deville"
"Dodge Magnum"
"Merc. XR-7"
"Pont. Catalina"
warning: variable has embedded blanks
-----------------------------------------------------------------------------------------------------------------------------
make_wth_mimssing Make and Model
-----------------------------------------------------------------------------------------------------------------------------
type: string (str18), but longest is str17
unique values: 41 missing "": 33/74
examples: ""
""
"Cad. Deville"
"Mazda GLC"
warning: variable has embedded blanks
. summ mpg* price*
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         mpg |         74     21.2973    5.785503         12         41
mpg_wth_mi~g |         53    21.15094     6.41931         12         41
       price |         74    6165.257    2949.496       3291      15906
price_wth_~g |         58    6254.069    3219.379       3291      15906
. list make* mpg* price*
+----------------------------------------------------------------------------+
| make make_wth_mimssing mpg mpg_wt~g price price_~g |
|----------------------------------------------------------------------------|
1. | AMC Concord 22 . 4,099 . |
2. | AMC Pacer AMC Pacer 17 17 4,749 4,749 |
3. | AMC Spirit AMC Spirit 22 22 3,799 3,799 |
4. | Buick Century Buick Century 20 20 4,816 4,816 |
5. | Buick Electra Buick Electra 15 15 7,827 7,827 |
|----------------------------------------------------------------------------|
6. | Buick LeSabre Buick LeSabre 18 18 5,788 5,788 |
7. | Buick Opel Buick Opel 26 26 4,453 4,453 |
8. | Buick Regal Buick Regal 20 20 5,189 5,189 |
9. | Buick Riviera Buick Riviera 16 16 10,372 . |
10. | Buick Skylark Buick Skylark 19 19 4,082 4,082 |
|----------------------------------------------------------------------------|
11. | Cad. Deville Cad. Deville 14 14 11,385 11,385 |
12. | Cad. Eldorado Cad. Eldorado 14 14 14,500 14,500 |
13. | Cad. Seville Cad. Seville 21 . 15,906 15,906 |
14. | Chev. Chevette Chev. Chevette 29 29 3,299 3,299 |
15. | Chev. Impala Chev. Impala 16 . 5,705 5,705 |
|----------------------------------------------------------------------------|
16. | Chev. Malibu Chev. Malibu 22 22 4,504 4,504 |
17. | Chev. Monte Carlo Chev. Monte Carlo 22 22 5,104 5,104 |
18. | Chev. Monza Chev. Monza 24 24 3,667 3,667 |
19. | Chev. Nova Chev. Nova 19 19 3,955 3,955 |
20. | Dodge Colt 30 . 3,984 3,984 |
|----------------------------------------------------------------------------|
21. | Dodge Diplomat 18 18 4,010 4,010 |
22. | Dodge Magnum 16 16 5,886 5,886 |
23. | Dodge St. Regis 17 . 6,342 . |
24. | Ford Fiesta Ford Fiesta 28 28 4,389 4,389 |
25. | Ford Mustang Ford Mustang 21 . 4,187 4,187 |
|----------------------------------------------------------------------------|
26. | Linc. Continental Linc. Continental 12 12 11,497 11,497 |
27. | Linc. Mark V 12 12 13,594 13,594 |
28. | Linc. Versailles 14 14 13,466 13,466 |
29. | Merc. Bobcat Merc. Bobcat 22 22 3,829 3,829 |
30. | Merc. Cougar 14 14 5,379 5,379 |
|----------------------------------------------------------------------------|
31. | Merc. Marquis 15 . 6,165 . |
32. | Merc. Monarch Merc. Monarch 18 18 4,516 4,516 |
33. | Merc. XR-7 14 14 6,303 6,303 |
34. | Merc. Zephyr Merc. Zephyr 20 . 3,291 3,291 |
35. | Olds 98 Olds 98 21 . 8,814 8,814 |
|----------------------------------------------------------------------------|
36. | Olds Cutl Supr 19 . 5,172 . |
37. | Olds Cutlass Olds Cutlass 19 19 4,733 4,733 |
38. | Olds Delta 88 18 18 4,890 4,890 |
39. | Olds Omega 19 19 4,181 4,181 |
40. | Olds Starfire 24 24 4,195 4,195 |
|----------------------------------------------------------------------------|
41. | Olds Toronado Olds Toronado 16 16 10,371 10,371 |
42. | Plym. Arrow Plym. Arrow 28 28 4,647 4,647 |
43. | Plym. Champ Plym. Champ 34 34 4,425 4,425 |
44. | Plym. Horizon 25 25 4,482 4,482 |
45. | Plym. Sapporo 26 . 6,486 . |
|----------------------------------------------------------------------------|
46. | Plym. Volare 18 18 4,060 4,060 |
47. | Pont. Catalina 18 . 5,798 . |
48. | Pont. Firebird 18 18 4,934 4,934 |
49. | Pont. Grand Prix 19 19 5,222 5,222 |
50. | Pont. Le Mans 19 . 4,723 . |
|----------------------------------------------------------------------------|
51. | Pont. Phoenix Pont. Phoenix 19 19 4,424 4,424 |
52. | Pont. Sunbird 24 . 4,172 . |
53. | Audi 5000 Audi 5000 17 17 9,690 9,690 |
54. | Audi Fox 23 . 6,295 . |
55. | BMW 320i 25 . 9,735 9,735 |
|----------------------------------------------------------------------------|
56. | Datsun 200 Datsun 200 23 23 6,229 6,229 |
57. | Datsun 210 35 35 4,589 4,589 |
58. | Datsun 510 Datsun 510 24 . 5,079 5,079 |
59. | Datsun 810 21 21 8,129 . |
60. | Fiat Strada Fiat Strada 21 21 4,296 4,296 |
|----------------------------------------------------------------------------|
61. | Honda Accord 25 25 5,799 . |
62. | Honda Civic 28 28 4,499 . |
63. | Mazda GLC Mazda GLC 30 30 3,995 3,995 |
64. | Peugeot 604 Peugeot 604 14 14 12,990 12,990 |
65. | Renault Le Car 26 . 3,895 . |
|----------------------------------------------------------------------------|
66. | Subaru Subaru 35 35 3,798 3,798 |
67. | Toyota Celica 18 . 5,899 5,899 |
68. | Toyota Corolla Toyota Corolla 31 31 3,748 3,748 |
69. | Toyota Corona Toyota Corona 18 18 5,719 5,719 |
70. | VW Dasher VW Dasher 23 23 7,140 7,140 |
|----------------------------------------------------------------------------|
71. | VW Diesel VW Diesel 41 41 5,397 5,397 |
72. | VW Rabbit 25 . 4,697 . |
73. | VW Scirocco 25 . 6,850 . |
74. | Volvo 260 17 17 11,995 11,995 |
+----------------------------------------------------------------------------+
.
. ///--- End Log and to HTML
> log close _all
name: drop_random_subset
log: C:\Users\fan/Stata4Econ//rand/basic/fs_droprand.smcl
log type: smcl
closed on: 6 Oct 2019, 18:50:13
-----------------------------------------------------------------------------------------------------------------------------
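
The matching rule in the log, round((_n/_N)*it_drop_frac) == round(it_drop_frac*uniform()), set 33, 21, and 16 of 74 values to missing for it_drop_frac = 2, 3, and 5. The realized shares (about 0.45, 0.28, and 0.22) therefore only loosely track 1/it_drop_frac. Where a share closer to a target is wanted, two common alternatives could be sketched as below. This is a minimal sketch, not the author's method: the names p_drop, u_draw, orig_order, mpg_miss_approx, and mpg_miss_exact are hypothetical, and runiform() is used here as the current name for the uniform() call seen in the log.

```stata
* Sketch only: p_drop, u_draw, orig_order and the mpg_miss_* names are
* hypothetical and do not appear in the original do-file.
sysuse auto, clear
set seed 987

* Target share of values to set to missing
scalar p_drop = 1/3

* (a) Approximate share: each row is blanked independently with
*     probability p_drop, so the realized count varies with the seed.
clonevar mpg_miss_approx = mpg
replace mpg_miss_approx = . if runiform() < p_drop

* (b) Exact share: order rows by a random draw and blank the first
*     floor(p_drop*_N) of them, then restore the original row order.
gen long orig_order = _n
gen double u_draw = runiform()
sort u_draw
clonevar mpg_miss_exact = mpg
replace mpg_miss_exact = . in 1/`=floor(p_drop*_N)'
sort orig_order

* Compare how many values each variant set to missing
count if missing(mpg_miss_approx)
count if missing(mpg_miss_exact)
```

With p_drop = 1/3 and the 74 observations in auto, variant (b) sets floor(74/3) = 24 values to missing on every run, while the count from variant (a) varies around that number from seed to seed.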