04_Apply_US_Crime_Rates


04_Apply_US_Crime_Rates_Exercises

gamnyam 2024. 6. 24. 10:05

1. Import the necessary libraries.

import pandas as pd

2. Import the data from the following URL.

url = "https://raw.githubusercontent.com/myoh0623/dataset/main/US_Crime_Rates_1960_2014.csv"

3. Assign the DataFrame to a variable named crime.

crime = pd.read_csv(url)

4. What are the data types of the columns?

crime.dtypes

# crime.info()

5. Convert the dtype of the Year column to datetime64 (use to_datetime).

crime.Year.dtype

dtype('int64')

crime['Year'] = pd.to_datetime(crime['Year'], format='%Y')
crime.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 55 entries, 0 to 54
Data columns (total 12 columns):
 #   Column              Non-Null Count  Dtype         
---  ------              --------------  -----         
 0   Year                55 non-null     datetime64[ns]
 1   Population          55 non-null     int64         
 2   Total               55 non-null     int64         
 3   Violent             55 non-null     int64         
 4   Property            55 non-null     int64         
 5   Murder              55 non-null     int64         
 6   Forcible_Rape       55 non-null     int64         
 7   Robbery             55 non-null     int64         
 8   Aggravated_assault  55 non-null     int64         
 9   Burglary            55 non-null     int64         
 10  Larceny_Theft       55 non-null     int64         
 11  Vehicle_Theft       55 non-null     int64         
dtypes: datetime64[ns](1), int64(11)
memory usage: 5.3 KB
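Why format='%Y' matters can be seen on a small example. The sketch below uses a made-up three-row frame (the name toy and its values are illustrative assumptions, not the real dataset) and casts the years to strings before parsing, which is a cautious variant of the call above rather than a required step.

```python
import pandas as pd

# Toy stand-in for the real CSV; the values are made up for illustration.
toy = pd.DataFrame({"Year": [1960, 1961, 1962], "Murder": [10, 12, 11]})

# Without a format, integers are read as nanoseconds since 1970-01-01,
# so every row ends up in the year 1970.
wrong = pd.to_datetime(toy["Year"])

# With format="%Y" the value is parsed as a four-digit year.
right = pd.to_datetime(toy["Year"].astype(str), format="%Y")

print(wrong.dt.year.tolist())
print(right.dt.year.tolist())
```

Each parsed value falls on January 1 of its year, which is what the later resample step relies on.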

6. Set the Year column as the index of the DataFrame. The existing index is discarded.

crime.set_index('Year', drop = True, inplace=True)

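* set_index: makes the given column the index of the DataFrame.

* drop = True (the default): removes Year from the regular columns once it becomes the index. The previous RangeIndex is replaced either way.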
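The effect of drop is easiest to see side by side. This is a minimal sketch on a made-up two-row frame (toy, dropped and kept are illustrative names), not the crime data itself.

```python
import pandas as pd

# Made-up data for illustration only.
toy = pd.DataFrame({
    "Year": pd.to_datetime(["1960", "1961"], format="%Y"),
    "Murder": [10, 12],
})

dropped = toy.set_index("Year")            # drop=True is the default
kept = toy.set_index("Year", drop=False)   # Year stays as a column too

print(list(dropped.columns))
print(list(kept.columns))
```

In both results the old 0, 1, ... RangeIndex is gone; drop only controls whether Year also remains as an ordinary column.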

7. Delete the Total column.

del crime["Total"]

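cf) drop removes the same column, but it returns a new DataFrame instead of modifying crime in place, so the result has to be assigned back: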

crime.drop(["Total"], axis = 1)
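The difference between del and drop can be checked on a small frame. The sketch below uses made-up data and hypothetical names (toy, without_total).

```python
import pandas as pd

# Made-up data for illustration only.
toy = pd.DataFrame({"Total": [3, 4], "Murder": [1, 2]})

# drop returns a new DataFrame; toy itself is unchanged.
without_total = toy.drop(columns=["Total"])
still_there = "Total" in toy.columns

# del modifies the DataFrame in place.
del toy["Total"]
gone_now = "Total" not in toy.columns

print(still_there, gone_now)
```

Writing drop(columns=[...]) is equivalent to drop([...], axis=1) and states the intent a little more directly.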

8. Group the years into decades and sum the values.

crime_per_year = crime.resample('10YS').sum().copy()
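How resample('10YS') forms its bins can be sketched on twenty made-up years. The frame toy and its values are assumptions for illustration. The per-column agg shown here is one possible refinement, not what the line above does: sum() also adds up Population across ten years, which is not a meaningful figure, so the sketch takes the maximum for that column instead.

```python
import pandas as pd

# Made-up yearly data for 1960 through 1979, indexed by January 1 of each year.
years = pd.date_range("1960-01-01", periods=20, freq="YS")
toy = pd.DataFrame(
    {
        "Population": list(range(100, 120)),
        "Murder": [1] * 10 + [2] * 10,
    },
    index=years,
)

# Ten-year bins labelled by their start date; counts are summed,
# population is taken as the largest value within the decade.
decades = toy.resample("10YS").agg({"Population": "max", "Murder": "sum"})

print(decades)
```

Each row of the result is labelled with the first day of its decade (1960-01-01, 1970-01-01).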

Time series / date functionality — pandas 2.2.2 documentation (pydata.org)


9. Which decade was the most dangerous to live in the US (the one with the most murders)?

crime_per_year.idxmax()["Murder"]

* idxmax / idxmin: return the index label of the row (or column) that holds the maximum / minimum value.
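A short sketch of idxmax on a decade-indexed frame follows. The numbers are made up and the names decades and worst are illustrative; selecting the column first and then calling idxmax is an equivalent way to get the same label as idxmax()["Murder"].

```python
import pandas as pd

# Made-up decade totals for illustration only.
decades = pd.DataFrame(
    {"Murder": [10, 30, 20], "Robbery": [5, 6, 9]},
    index=pd.to_datetime(["1960-01-01", "1970-01-01", "1980-01-01"]),
)

# Index label of the row where Murder is largest.
worst = decades["Murder"].idxmax()

# Called on the whole frame, idxmax returns one label per column.
per_column = decades.idxmax()

print(worst.year)
print(per_column)
```

Because the index holds timestamps, the result is a Timestamp, so worst.year gives the starting year of the decade.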
