
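I am using tidycensus to pull data for my dissertation for every census tract in the country, for three different periods (the 2000 decennial census, ACS 2009-2013, and ACS 2015-2019).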

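Following Kyle Walker's tutorial, I was able to use the map_df function to build the call below, and it works. The result is a data frame containing every variable listed in the vector for every census tract in the country: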

library(tidycensus)
library(tidyverse)

# get vector of state fips codes for US (50 states + DC)
us <- unique(fips_codes$state)[1:51]



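# select my variables
# note: the B02001_* codes are race-alone counts regardless of Hispanic origin;
# the non-Hispanic counts by race are B03002_003 through B03002_009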
my_vars19 <- c(pop = "B01003_001", 
               racetot = "B03002_001", 
               nhtot = "B03002_002", 
               nhwht = "B02001_002", 
               nhblk = "B02001_003", 
               nhnat = "B02001_004", 
               nhasian = "B02001_005", 
               nhpac = "B02001_006", 
               nhother = "B02001_007",
               nhtwo = "B02001_008", 
               hisp = "B03003_003",             
               male = "B01001_002",
               female = "B01001_026")



# function call to obtain tracts for US, one request per state
acs2019 <- map_df(us, function(x) {
  get_acs(geography = "tract",
          variables = my_vars19,
          state = x,
          year = 2019)  # pin the 2015-2019 5-year ACS explicitly
})

glimpse(acs2019)

Rows: 949,728
Columns: 5
$ GEOID    <chr> "01001020100", "01001020100", "01001020100", "01001020100", "01001020100", "01001020100", "…
$ NAME     <chr> "Census Tract 201, Autauga County, Alabama", "Census Tract 201, Autauga County, Alabama", "…
$ variable <chr> "male", "female", "pop", "nhwht", "nhblk", "nhnat", "nhasian", "nhpac", "nhother", "nhtwo",…
$ estimate <dbl> 907, 1086, 1993, 1685, 152, 0, 2, 0, 0, 154, 1993, 1967, 26, 1058, 901, 1959, 759, 1117, 0,…
$ moe      <dbl> 118, 178, 225, 202, 78, 12, 5, 12, 12, 120, 225, 226, 36, 137, 133, 202, 113, 180, 12, 12, …

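This is only a practice call, though. I need to pull close to 150 to 200 variables for each period in the analysis (2000, 2009-2013, and 2015-2019). I am worried that pulling that many variables for that many states and census tracts will put a heavy load on the API. I also believe there is a limit on the number of variables that can be pulled in a single request.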

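I could group the calls by variable type, but I worry that splitting the calls up could get unwieldy, and I would then need to combine the results again. I would like to know what the standard practice is for building large datasets with tidycensus.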
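One way to keep grouped calls manageable is to let a small helper do the splitting and the recombining. The sketch below is illustrative only: chunk_vars() and pull_in_chunks() are hypothetical helper names, not part of tidycensus, and the default chunk size of 40 is an assumption chosen to stay under the Census API's cap of 50 variables per request. It uses base R only, so the logic can be tried without a network connection by passing any function as fetch.

```r
# Split a named vector of variable codes into chunks of at most `size`,
# keeping the names (the labels such as "pop" or "male") attached.
# chunk_vars() is a hypothetical helper, not a tidycensus function.
chunk_vars <- function(vars, size = 40) {
  stopifnot(size >= 1)
  if (length(vars) == 0) return(list())
  groups <- ceiling(seq_along(vars) / size)
  unname(lapply(split(seq_along(vars), groups), function(i) vars[i]))
}

# Call `fetch` once per chunk and stack the long-format results.
# `fetch` is any function that takes a named vector of variables and
# returns a data frame, for example a thin wrapper around get_acs().
pull_in_chunks <- function(vars, fetch, size = 40) {
  pieces <- lapply(chunk_vars(vars, size), fetch)
  do.call(rbind, pieces)
}

# Hypothetical usage (needs tidycensus and a Census API key; not run here):
# fetch_al <- function(v) {
#   tidycensus::get_acs(geography = "tract", variables = v,
#                       state = "AL", year = 2019)
# }
# acs_al <- pull_in_chunks(my_vars19, fetch_al)
```

Because each chunk comes back in the same long format (GEOID, NAME, variable, estimate, moe), stacking the pieces by row is enough to recombine them; no join is needed.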

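Do people typically break the call up, or do they just request whole tables? Or is there a more efficient system than the one I have outlined? I know most people use tidycensus to pull a handful of variables, but what do they do when they need to pull a lot?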
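For comparison, requesting whole tables might look like the sketch below. It relies on the table argument of get_acs(), which returns every variable in a table such as "B03002". The wrapper pull_tables() is a hypothetical name, and fetch is exposed as a parameter only as an assumption of this sketch, so that the looping logic can be checked with a stand-in function instead of a live API call.

```r
# Pull whole ACS tables for a set of states, one request per table/state pair.
# pull_tables() is a hypothetical helper. `fetch` defaults to
# tidycensus::get_acs and is only looked up when the function is called.
pull_tables <- function(tables, states, year = 2019,
                        fetch = tidycensus::get_acs) {
  # every combination of table and state
  grid <- expand.grid(table = tables, state = states,
                      stringsAsFactors = FALSE)
  pieces <- Map(function(tbl, st) {
    fetch(geography = "tract", table = tbl, state = st, year = year)
  }, grid$table, grid$state)
  # the results share the same long format, so they can be stacked by row
  do.call(rbind, unname(pieces))
}

# Hypothetical usage (needs tidycensus and a Census API key; not run here):
# race_sex <- pull_tables(c("B01001", "B03002"), us)
```

With this approach the variable column holds raw codes such as B03002_001 rather than custom labels, so any renaming would have to happen after the pull.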
