# --- Merge Bureau of Economic Analysis (BEA) metro-area GDP data ---
raw_bea = gbd.get_bea_data('http://www.bea.gov/newsreleases/regional/gdp_metro/2015/xls/gdp_metro0915.xls')
bea_df = gbd.clean_me(raw_bea)
bea_df = bea_df[:-2]  # drop the last two rows (presumably spreadsheet footnote rows -- confirm)
# Keep only metros with 2014 GDP above 20,000 and place them alongside the existing frame.
next_df = pd.concat([new_df, bea_df[bea_df['bea_2014'] > 20000]], axis=1)
print('Bureau of Economic Affairs data merged!')  # print() is valid on both Python 2 and 3

# --- Incorporate Numbeo cost-of-living data ---
url_prefix = 'http://www.numbeo.com/cost-of-living/region_rankings.jsp?title='
# FIX: was '®ion=021' -- mojibake from the '&reg' in '&region=021' being
# decoded as the HTML entity '®'. The intended query parameter is '&region=021'.
url_suffix = '&region=021'
year_list = ['2009', '2010', '2011', '2012', '2013', '2014', '2015', '2016']
urls = ns.build_urls(year_list)
for url in urls:
    soup_can = ns.get_pages(url)
# NOTE(review): soup_can is overwritten on each pass of the loop above, so only
# the last URL's pages reach the comprehension below -- verify against
# ns.get_pages; the flattened source is ambiguous here.
table_list = [ns.clean_up(soup) for soup in soup_can]
zipped = list(zip(year_list, table_list))
df_dict = ns.build_data_frames(zipped)

# Suffix each per-year frame's metric columns with its year so the frames can
# be merged side by side later; 'Rank' and 'City' stay unsuffixed as keys.
for item in year_list:
    columns = ns.fix_em(['Rank', 'City', 'Cost of Living Index', 'Rent Index',
                         'Cost of Living Plus Rent Index', 'Groceries Index',
                         'Restaurant Price Index', 'Local Purchasing Power Index'])
    first_cols = columns[:2]
    first_cols.extend([column + '_{}'.format(item) for column in columns[2:]])
    df_dict[item].columns = first_cols


def clean_up_df(df):
    """Split a 'City, ST' column into snake_cased city/state and drop 'rank'.

    Mutates *df* in place and also returns it for convenience.
    """
    df['state'] = df['city'].apply(lambda x: x.split(',')[1].strip().lower().replace(' ', '_'))
    df['city'] = df['city'].apply(lambda x: x.split(',')[0].lower().replace(' ', '_'))
    del df['rank']
    return df
# NOTE(review): this line appears to be a truncated, re-quoted duplicate of the
# script directly above (it stops mid-expression -- the ns.fix_em( call never
# closes, so this is not valid Python as written). Looks like a paste/merge
# artifact; confirm intent and remove or reconcile with the block above.
raw_bea = gbd.get_bea_data("http://www.bea.gov/newsreleases/regional/gdp_metro/2015/xls/gdp_metro0915.xls") bea_df = gbd.clean_me(raw_bea) bea_df = bea_df[:-2] next_df = pd.concat([new_df, bea_df[bea_df["bea_2014"] > 20000]], axis=1) print "Bureau of Economic Affairs data merged!" # incorporate numbeo data: url_prefix = "http://www.numbeo.com/cost-of-living/region_rankings.jsp?title=" url_suffix = "®ion=021" year_list = ["2009", "2010", "2011", "2012", "2013", "2014", "2015", "2016"] urls = ns.build_urls(year_list) for url in urls: soup_can = ns.get_pages(url) table_list = [ns.clean_up(soup) for soup in soup_can] zipped = list(zip(year_list, table_list)) df_dict = ns.build_data_frames(zipped) for item in year_list: columns = ns.fix_em( [ "Rank", "City", "Cost of Living Index", "Rent Index", "Cost of Living Plus Rent Index", "Groceries Index", "Restaurant Price Index", "Local Purchasing Power Index", ]
def get_walk_data(url):
    """Download *url*, parse it with lxml, and return clean_up() of the soup.

    Parameters
    ----------
    url : str
        Page to fetch.

    Returns
    -------
    Whatever the module-level clean_up() helper produces for the parsed page.
    """
    # timeout keeps a dead or stalled server from hanging the scrape forever;
    # the previous call had no timeout at all.
    doc = requests.get(url, timeout=30).text
    soup = BeautifulSoup(doc, 'lxml')
    return clean_up(soup)