Stata: Drop a Random Subset of Observations (do-file; for more, see Fan and Stata4Econ)

-----------------------------------------------------------------------------------------------------------------------------
      name:  drop_random_subset
       log:  C:\Users\fan/Stata4Econ//rand/basic/fs_droprand.smcl
  log type:  smcl
 opened on:   6 Oct 2019, 18:50:13


. log on $st_logname
(log already on)

.
. ///-- Site Link: Fan's Project Reusable Stata Codes Table of Content
> di "https://fanwangecon.github.io/"
https://fanwangecon.github.io/

. di "https://fanwangecon.github.io/Stata4Econ/"
https://fanwangecon.github.io/Stata4Econ/

.
. ///-- File Title
> global filetitle "Stata Drop a Random Subset of Observations"

.
. ///--- Load Data
> set more off

. sysuse auto, clear
(1978 Automobile Data)

.
. ///--- Generating Index for Dropping
> set seed 987

. scalar it_drop_frac = 3

. gen row_idx_it = round((_n/_N)*it_drop_frac)

. gen row_idx_rand = round(it_drop_frac*uniform())

.
. //--- drop when row_idx_it == row_idx_rand, if it_drop_frac set at 3
. list make price mpg row_idx_it row_idx_rand, ab(20)

     +--------------------------------------------------------------+
     | make                 price   mpg   row_idx_it   row_idx_rand |
     |--------------------------------------------------------------|
  1. | AMC Concord          4,099    22            0              0 |
  2. | AMC Pacer            4,749    17            0              2 |
  3. | AMC Spirit           3,799    22            0              1 |
  4. | Buick Century        4,816    20            0              3 |
  5. | Buick Electra        7,827    15            0              1 |
     |--------------------------------------------------------------|
  6. | Buick LeSabre        5,788    18            0              1 |
  7. | Buick Opel           4,453    26            0              2 |
  8. | Buick Regal          5,189    20            0              2 |
  9. | Buick Riviera       10,372    16            0              1 |
 10. | Buick Skylark        4,082    19            0              2 |
     |--------------------------------------------------------------|
 11. | Cad. Deville        11,385    14            0              1 |
 12. | Cad. Eldorado       14,500    14            0              2 |
 13. | Cad. Seville        15,906    21            1              1 |
 14. | Chev. Chevette       3,299    29            1              2 |
 15. | Chev. Impala         5,705    16            1              1 |
     |--------------------------------------------------------------|
 16. | Chev. Malibu         4,504    22            1              2 |
 17. | Chev. Monte Carlo    5,104    22            1              2 |
 18. | Chev. Monza          3,667    24            1              2 |
 19. | Chev. Nova           3,955    19            1              3 |
 20. | Dodge Colt           3,984    30            1              1 |
     |--------------------------------------------------------------|
 21. | Dodge Diplomat       4,010    18            1              2 |
 22. | Dodge Magnum         5,886    16            1              2 |
 23. | Dodge St. Regis      6,342    17            1              1 |
 24. | Ford Fiesta          4,389    28            1              0 |
 25. | Ford Mustang         4,187    21            1              1 |
     |--------------------------------------------------------------|
 26. | Linc. Continental   11,497    12            1              3 |
 27. | Linc. Mark V        13,594    12            1              2 |
 28. | Linc. Versailles    13,466    14            1              2 |
 29. | Merc. Bobcat         3,829    22            1              3 |
 30. | Merc. Cougar         5,379    14            1              2 |
     |--------------------------------------------------------------|
 31. | Merc. Marquis        6,165    15            1              1 |
 32. | Merc. Monarch        4,516    18            1              3 |
 33. | Merc. XR-7           6,303    14            1              2 |
 34. | Merc. Zephyr         3,291    20            1              1 |
 35. | Olds 98              8,814    21            1              1 |
     |--------------------------------------------------------------|
 36. | Olds Cutl Supr       5,172    19            1              1 |
 37. | Olds Cutlass         4,733    19            2              1 |
 38. | Olds Delta 88        4,890    18            2              1 |
 39. | Olds Omega           4,181    19            2              1 |
 40. | Olds Starfire        4,195    24            2              1 |
     |--------------------------------------------------------------|
 41. | Olds Toronado       10,371    16            2              3 |
 42. | Plym. Arrow          4,647    28            2              1 |
 43. | Plym. Champ          4,425    34            2              0 |
 44. | Plym. Horizon        4,482    25            2              1 |
 45. | Plym. Sapporo        6,486    26            2              2 |
     |--------------------------------------------------------------|
 46. | Plym. Volare         4,060    18            2              1 |
 47. | Pont. Catalina       5,798    18            2              2 |
 48. | Pont. Firebird       4,934    18            2              1 |
 49. | Pont. Grand Prix     5,222    19            2              1 |
 50. | Pont. Le Mans        4,723    19            2              2 |
     |--------------------------------------------------------------|
 51. | Pont. Phoenix        4,424    19            2              1 |
 52. | Pont. Sunbird        4,172    24            2              2 |
 53. | Audi 5000            9,690    17            2              3 |
 54. | Audi Fox             6,295    23            2              2 |
 55. | BMW 320i             9,735    25            2              2 |
     |--------------------------------------------------------------|
 56. | Datsun 200           6,229    23            2              1 |
 57. | Datsun 210           4,589    35            2              3 |
 58. | Datsun 510           5,079    24            2              2 |
 59. | Datsun 810           8,129    21            2              3 |
 60. | Fiat Strada          4,296    21            2              0 |
     |--------------------------------------------------------------|
 61. | Honda Accord         5,799    25            2              3 |
 62. | Honda Civic          4,499    28            3              2 |
 63. | Mazda GLC            3,995    30            3              1 |
 64. | Peugeot 604         12,990    14            3              2 |
 65. | Renault Le Car       3,895    26            3              3 |
     |--------------------------------------------------------------|
 66. | Subaru               3,798    35            3              1 |
 67. | Toyota Celica        5,899    18            3              3 |
 68. | Toyota Corolla       3,748    31            3              1 |
 69. | Toyota Corona        5,719    18            3              0 |
 70. | VW Dasher            7,140    23            3              1 |
     |--------------------------------------------------------------|
 71. | VW Diesel            5,397    41            3              0 |
 72. | VW Rabbit            4,697    25            3              3 |
 73. | VW Scirocco          6,850    25            3              3 |
 74. | Volvo 260           11,995    17            3              2 |
     +--------------------------------------------------------------+
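
The listing above marks a row for blanking whenever row_idx_it equals row_idx_rand. As a rough check on how many rows that rule should hit, the sketch below computes the expected number of matches for it_drop_frac = 3. It assumes the random draws are uniform on [0,1), so that round(k*u) lands on 0 or on k with probability 1/(2k) each and on any interior integer with probability 1/k. The variable p_match is a name introduced here for illustration and is not part of the original do-file.

```stata
sysuse auto, clear
scalar it_drop_frac = 3
gen row_idx_it = round((_n/_N)*it_drop_frac)

* Assumption: P(round(k*u) == j) is 1/(2k) for j in {0, k} and 1/k otherwise
* p_match is a hypothetical helper variable, not in the original do-file
gen double p_match = cond(inlist(row_idx_it, 0, it_drop_frac), ///
    1/(2*it_drop_frac), 1/it_drop_frac)

* Sum of per-row match probabilities = expected number of rows blanked
quietly summarize p_match
display "expected matches: " r(sum) " of " r(N) " rows"
```

Under that assumption the sum is 20.5 of 74 rows, roughly 28% rather than exactly one third, which is in line with the 21 matching rows visible in the listing. For large N the share is approximately (2k-1)/(2k^2), so "approximately 1/k" overstates the share somewhat, most noticeably for small k.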

.
. ///--- Drop approximately 1/2 of make randomly
> set seed 987

. scalar it_drop_frac = 2

. clonevar make_wth_mimssing = make

. replace make_wth_mimssing = "" if round((_n/_N)*it_drop_frac) == round(it_drop_frac*uniform())
(33 real changes made)

.
. ///--- Drop approximately 1/3 of mpg randomly
> set seed 987

. scalar it_drop_frac = 3

. clonevar mpg_wth_mimssing = mpg

. replace mpg_wth_mimssing =. if round((_n/_N)*it_drop_frac) == round(it_drop_frac*uniform())
(21 real changes made, 21 to missing)

.
. ///--- Drop approximately 1/5 of price randomly
> set seed 987

. scalar it_drop_frac = 5

. clonevar price_wth_mimssing = price

. replace price_wth_mimssing =. if round((_n/_N)*it_drop_frac) == round(it_drop_frac*uniform())
(16 real changes made, 16 to missing)
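
The replace commands above hit a random number of rows (33, 21, and 16 here), so the realized share only approximates the target. If an exact count is wanted instead, one possible alternative, not the method used in this log, is to rank rows by a random draw and blank the lowest-ranked ones. The sketch below assumes runiform() and the egen rank() function with the unique option. The names rand_u, rand_rank, and mpg_exact_missing are hypothetical.

```stata
sysuse auto, clear
set seed 987

* rand_u and rand_rank are hypothetical helper names, not from the original do-file
gen double rand_u = runiform()
egen rand_rank = rank(rand_u), unique

* Blank exactly floor(_N/3) rows: those with the smallest random ranks
clonevar mpg_exact_missing = mpg
replace mpg_exact_missing = . if rand_rank <= floor(_N/3)

* With 74 rows this should report floor(74/3) = 24 missing values
count if missing(mpg_exact_missing)
```

The trade-off is that the number of missing values is fixed in advance, whereas the index-matching rule in this log lets it vary from seed to seed.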

.
. ///--- Summarize
> codebook make*

-----------------------------------------------------------------------------------------------------------------------------
make                                                                                                           Make and Model
-----------------------------------------------------------------------------------------------------------------------------

                  type:  string (str18), but longest is str17

         unique values:  74                       missing "":  0/74

              examples:  "Cad. Deville"
                         "Dodge Magnum"
                         "Merc. XR-7"
                         "Pont. Catalina"

               warning:  variable has embedded blanks

-----------------------------------------------------------------------------------------------------------------------------
make_wth_mimssing                                                                                              Make and Model
-----------------------------------------------------------------------------------------------------------------------------

                  type:  string (str18), but longest is str17

         unique values:  41                       missing "":  33/74

              examples:  ""
                         ""
                         "Cad. Deville"
                         "Mazda GLC"

               warning:  variable has embedded blanks

. summ mpg* price*

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
         mpg |         74     21.2973    5.785503         12         41
mpg_wth_mi~g |         53    21.15094     6.41931         12         41
       price |         74    6165.257    2949.496       3291      15906
price_wth_~g |         58    6254.069    3219.379       3291      15906
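
The Obs column above shows 53 of 74 mpg values and 58 of 74 price values surviving, so about 28% and 22% were set to missing. A simpler way to get a random share near a target, sketched here as an alternative rather than as the method used in this log, is to blank each row independently with a fixed probability. The sketch assumes runiform(). The name mpg_bern_missing is hypothetical.

```stata
sysuse auto, clear
set seed 987

* Assumed alternative: blank each row independently with probability 1/3
* mpg_bern_missing is a hypothetical name, not from the original do-file
clonevar mpg_bern_missing = mpg
replace mpg_bern_missing = . if runiform() < 1/3

* Realized share of missing values (varies with the seed)
quietly count if missing(mpg_bern_missing)
display "share missing: " r(N)/_N
```

With independent draws the expected share equals the chosen probability, while the realized share still varies with the seed.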

. list make* mpg* price*

     +----------------------------------------------------------------------------+
     | make                make_wth_mimssing   mpg   mpg_wt~g    price   price_~g |
     |----------------------------------------------------------------------------|
  1. | AMC Concord                              22          .    4,099          . |
  2. | AMC Pacer           AMC Pacer            17         17    4,749      4,749 |
  3. | AMC Spirit          AMC Spirit           22         22    3,799      3,799 |
  4. | Buick Century       Buick Century        20         20    4,816      4,816 |
  5. | Buick Electra       Buick Electra        15         15    7,827      7,827 |
     |----------------------------------------------------------------------------|
  6. | Buick LeSabre       Buick LeSabre        18         18    5,788      5,788 |
  7. | Buick Opel          Buick Opel           26         26    4,453      4,453 |
  8. | Buick Regal         Buick Regal          20         20    5,189      5,189 |
  9. | Buick Riviera       Buick Riviera        16         16   10,372          . |
 10. | Buick Skylark       Buick Skylark        19         19    4,082      4,082 |
     |----------------------------------------------------------------------------|
 11. | Cad. Deville        Cad. Deville         14         14   11,385     11,385 |
 12. | Cad. Eldorado       Cad. Eldorado        14         14   14,500     14,500 |
 13. | Cad. Seville        Cad. Seville         21          .   15,906     15,906 |
 14. | Chev. Chevette      Chev. Chevette       29         29    3,299      3,299 |
 15. | Chev. Impala        Chev. Impala         16          .    5,705      5,705 |
     |----------------------------------------------------------------------------|
 16. | Chev. Malibu        Chev. Malibu         22         22    4,504      4,504 |
 17. | Chev. Monte Carlo   Chev. Monte Carlo    22         22    5,104      5,104 |
 18. | Chev. Monza         Chev. Monza          24         24    3,667      3,667 |
 19. | Chev. Nova          Chev. Nova           19         19    3,955      3,955 |
 20. | Dodge Colt                               30          .    3,984      3,984 |
     |----------------------------------------------------------------------------|
 21. | Dodge Diplomat                           18         18    4,010      4,010 |
 22. | Dodge Magnum                             16         16    5,886      5,886 |
 23. | Dodge St. Regis                          17          .    6,342          . |
 24. | Ford Fiesta         Ford Fiesta          28         28    4,389      4,389 |
 25. | Ford Mustang        Ford Mustang         21          .    4,187      4,187 |
     |----------------------------------------------------------------------------|
 26. | Linc. Continental   Linc. Continental    12         12   11,497     11,497 |
 27. | Linc. Mark V                             12         12   13,594     13,594 |
 28. | Linc. Versailles                         14         14   13,466     13,466 |
 29. | Merc. Bobcat        Merc. Bobcat         22         22    3,829      3,829 |
 30. | Merc. Cougar                             14         14    5,379      5,379 |
     |----------------------------------------------------------------------------|
 31. | Merc. Marquis                            15          .    6,165          . |
 32. | Merc. Monarch       Merc. Monarch        18         18    4,516      4,516 |
 33. | Merc. XR-7                               14         14    6,303      6,303 |
 34. | Merc. Zephyr        Merc. Zephyr         20          .    3,291      3,291 |
 35. | Olds 98             Olds 98              21          .    8,814      8,814 |
     |----------------------------------------------------------------------------|
 36. | Olds Cutl Supr                           19          .    5,172          . |
 37. | Olds Cutlass        Olds Cutlass         19         19    4,733      4,733 |
 38. | Olds Delta 88                            18         18    4,890      4,890 |
 39. | Olds Omega                               19         19    4,181      4,181 |
 40. | Olds Starfire                            24         24    4,195      4,195 |
     |----------------------------------------------------------------------------|
 41. | Olds Toronado       Olds Toronado        16         16   10,371     10,371 |
 42. | Plym. Arrow         Plym. Arrow          28         28    4,647      4,647 |
 43. | Plym. Champ         Plym. Champ          34         34    4,425      4,425 |
 44. | Plym. Horizon                            25         25    4,482      4,482 |
 45. | Plym. Sapporo                            26          .    6,486          . |
     |----------------------------------------------------------------------------|
 46. | Plym. Volare                             18         18    4,060      4,060 |
 47. | Pont. Catalina                           18          .    5,798          . |
 48. | Pont. Firebird                           18         18    4,934      4,934 |
 49. | Pont. Grand Prix                         19         19    5,222      5,222 |
 50. | Pont. Le Mans                            19          .    4,723          . |
     |----------------------------------------------------------------------------|
 51. | Pont. Phoenix       Pont. Phoenix        19         19    4,424      4,424 |
 52. | Pont. Sunbird                            24          .    4,172          . |
 53. | Audi 5000           Audi 5000            17         17    9,690      9,690 |
 54. | Audi Fox                                 23          .    6,295          . |
 55. | BMW 320i                                 25          .    9,735      9,735 |
     |----------------------------------------------------------------------------|
 56. | Datsun 200          Datsun 200           23         23    6,229      6,229 |
 57. | Datsun 210                               35         35    4,589      4,589 |
 58. | Datsun 510          Datsun 510           24          .    5,079      5,079 |
 59. | Datsun 810                               21         21    8,129          . |
 60. | Fiat Strada         Fiat Strada          21         21    4,296      4,296 |
     |----------------------------------------------------------------------------|
 61. | Honda Accord                             25         25    5,799          . |
 62. | Honda Civic                              28         28    4,499          . |
 63. | Mazda GLC           Mazda GLC            30         30    3,995      3,995 |
 64. | Peugeot 604         Peugeot 604          14         14   12,990     12,990 |
 65. | Renault Le Car                           26          .    3,895          . |
     |----------------------------------------------------------------------------|
 66. | Subaru              Subaru               35         35    3,798      3,798 |
 67. | Toyota Celica                            18          .    5,899      5,899 |
 68. | Toyota Corolla      Toyota Corolla       31         31    3,748      3,748 |
 69. | Toyota Corona       Toyota Corona        18         18    5,719      5,719 |
 70. | VW Dasher           VW Dasher            23         23    7,140      7,140 |
     |----------------------------------------------------------------------------|
 71. | VW Diesel           VW Diesel            41         41    5,397      5,397 |
 72. | VW Rabbit                                25          .    4,697          . |
 73. | VW Scirocco                              25          .    6,850          . |
 74. | Volvo 260                                17         17   11,995     11,995 |
     +----------------------------------------------------------------------------+

.
. ///--- End Log and to HTML
> log close _all
      name:  drop_random_subset
       log:  C:\Users\fan/Stata4Econ//rand/basic/fs_droprand.smcl
  log type:  smcl
 closed on:   6 Oct 2019, 18:50:13
-----------------------------------------------------------------------------------------------------------------------------