Working with CSV files
- 라임 샹큼
- Mar 26
I have worked mostly with txt and py files, but I extended my reach to csv files for organizing data and learned about the pandas library, which helps a lot with organizing information.
import pandas
data = pandas.read_csv('2018_Central_Park_Squirrel_Census_-_Squirrel_Data.csv')
# Convert the whole table to nested dicts: {column name: {row index: value}}
squirrel_data = data.to_dict()
squirrel_color_dict = squirrel_data['Primary Fur Color']
squirrel_colors = []
squirrel_color_count = {}
# Collect each distinct fur color
for color in squirrel_color_dict.values():
    if color not in squirrel_colors:
        squirrel_colors.append(color)
# Start every color's count at zero
for color in squirrel_colors:
    # Note: missing values load as float NaN, not the string 'NaN',
    # so this check does not actually filter them out
    if color != 'NaN':
        if color not in squirrel_color_count.keys():
            squirrel_color_count[color] = 0
# Count one squirrel at a time
for squirrel in squirrel_color_dict.values():
    squirrel_color_count[squirrel] += 1
squirrels = {'fur colors': list(squirrel_color_count.keys()),
             'number of': list(squirrel_color_count.values())}
print(squirrels)
print(pandas.DataFrame(squirrels))
Using a data set that recorded information about squirrels in New York's Central Park, I tried to write code that would tally just the fur color of the squirrels. Although this code worked, it gave me information I didn't need and was too long and wordy. I didn't make good use of the pandas library's functions either.
data = pandas.read_csv('2018_Central_Park_Squirrel_Census_-_Squirrel_Data.csv')
gray_squirrel_count = len(data[data['Primary Fur Color']=='Gray'])
red_squirrel_count = len(data[data['Primary Fur Color']=='Cinnamon'])
black_squirrel_count = len(data[data['Primary Fur Color']=='Black'])
Squirrel_data = {'Fur color': ['Gray', 'Red', 'Black'],
                 'Number': [gray_squirrel_count, red_squirrel_count, black_squirrel_count]}  # categories typed in by hand
sql_dta = pandas.DataFrame(Squirrel_data)
print(sql_dta)
sql_dta.to_csv('Squirrel_fur_color.csv')
I used to think the best code was code that stayed ready for any change in the information, but I realized that if I'm the one reading the code and there is little information to handle, I can just adjust and set the variables as I please. The code above assumes I already know the categories the information is organized into.
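When the categories aren't known ahead of time, pandas can do the tally itself. Here is a minimal sketch using the value_counts() method on a column. The sample rows are made up so the snippet runs without the census file, and the names color_counts and color_table are just for illustration:

```python
import pandas

# Hypothetical sample rows standing in for the real census file
data = pandas.DataFrame({
    'Primary Fur Color': ['Gray', 'Cinnamon', 'Gray', None, 'Black', 'Gray'],
})

# value_counts() tallies each distinct value and skips missing ones by default
color_counts = data['Primary Fur Color'].value_counts()

# Turn the result into a two-column table like the hand-built one
color_table = color_counts.rename_axis('Fur color').reset_index(name='Number')
print(color_table)
```

This trades control over the labels (it would say Cinnamon instead of Red) for not having to type each category by hand.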
The harder, more confusing part of this challenge was that I kept forgetting how to use the data[data[column] == value] pattern.
It really just means: within the data, keep only the rows where the value in that column matches the value you are looking for.
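Splitting the pattern into two steps makes it easier to remember. This is a small sketch with a made-up table (the Age column and the names is_gray and gray_rows are only for illustration):

```python
import pandas

# Hypothetical mini table; the real file has many more columns and rows
data = pandas.DataFrame({
    'Primary Fur Color': ['Gray', 'Cinnamon', 'Gray', 'Black'],
    'Age': ['Adult', 'Adult', 'Juvenile', 'Adult'],
})

# Step 1: the comparison gives one True/False per row
is_gray = data['Primary Fur Color'] == 'Gray'
print(is_gray.tolist())   # [True, False, True, False]

# Step 2: indexing with that mask keeps only the True rows
gray_rows = data[is_gray]
print(len(gray_rows))     # 2
```

Writing data[data['Primary Fur Color'] == 'Gray'] just does both steps on one line.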
Things I learned
Can read a csv file into a list of rows by importing the built-in csv module
Can go through all rows of a csv file with csv.reader(file_object), where file_object is the opened file, not the file name
import csv
weather_list = []
with open('weather_data.csv') as data:
    read_data = csv.reader(data)
    for row in read_data:
        weather_list.append(row)
print(weather_list)
Can read a file in one short line using the pandas library:
import pandas
pandas.read_csv('weather_data.csv')
Can create a data table from a dictionary by using the DataFrame class that comes with pandas
food = {'food': ['grapes', 'berries', 'apples'], 'price': [20, 10, 45]}
food_data = pandas.DataFrame(food)
print(food_data)
Can create a new csv file with pandas
food_data.to_csv('food_data.csv')
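One detail worth knowing about to_csv: by default it also writes the row index as an unnamed first column. A small sketch of the difference, calling to_csv() without a file name so it returns the text instead of writing a file (the variable names here are just for illustration):

```python
import pandas

food = {'food': ['grapes', 'berries', 'apples'], 'price': [20, 10, 45]}
food_data = pandas.DataFrame(food)

# With no file name, to_csv() returns the csv text instead of writing a file
with_index = food_data.to_csv()
without_index = food_data.to_csv(index=False)

print(with_index.splitlines()[0])      # ,food,price
print(without_index.splitlines()[0])   # food,price
```

Passing index=False together with a file name should give a file that contains only the food and price columns.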