SciTeens Online: Data Science Curriculums

By Shang Chen
March 10, 2021 · 2 minute read

Hello everyone! This week’s article is a primer for the upcoming SciTeens Online Data Science curriculums. If you haven’t heard already, SciTeens Online is a week-long data science program that gives high school students the skills needed to conduct advanced research, such as data exploration, plotting, statistical testing, and data sorting. For students who want to learn more about Data Science as a major, check out the TL;DR Majors post on Data Science.

To begin our journey into data science, we’ll cover some basics like importing a dataset and creating visualizations. For a relevant dataset, we'll use open-source COVID-19 statistics from The COVID Tracking Project at The Atlantic. How do we get this dataset from the website, and how might we go about analyzing it? The first step is to read in our data and take a look at its first few rows.
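As a minimal sketch of that first step, the snippet below reads a CSV into a pandas DataFrame and prints the first rows. The download URL, the `load_data` helper, and every column name except dataQualityGrade are assumptions made for illustration, and the sample rows are made up so the example runs without an internet connection.

```python
import io

import pandas as pd

# Assumed download location for the state-level history file. Check the
# project's website for the current link before relying on it.
DATA_URL = "https://covidtracking.com/data/download/all-states-history.csv"

# Made-up sample rows in the assumed column layout (not real figures).
SAMPLE_CSV = """date,state,positive,negative,dataQualityGrade
2020-03-15,CA,335,8000,B
2020-03-15,NY,729,4500,A
2020-03-16,CA,483,8500,B
2020-03-16,NY,950,5500,A
"""


def load_data(source):
    """Read a CSV from a URL, file path, or file-like object."""
    # parse_dates turns the text in the date column into real timestamps.
    return pd.read_csv(source, parse_dates=["date"])


# Swap in load_data(DATA_URL) to pull the full file from the web.
df = load_data(io.StringIO(SAMPLE_CSV))

# head() shows the first five rows by default (here, all four sample rows).
print(df.head())
```

Because `pd.read_csv` accepts a URL just as readily as a local file, the same helper works for the real dataset once you pass it the link.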

We’re not going to get too in-depth in this article, but there are many kinds of analysis we could do with this dataset. We can have pandas generate some basic statistics for the numeric columns with the .describe() method. It reports the count, mean, standard deviation, minimum, maximum, and quartiles of each column, including the median (the 50% row). If you're wondering what other methods you can call, search for the pandas documentation for a complete list of the functions available for data analysis.
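Here is a small illustration of .describe() on a toy DataFrame. The numbers are made up, and the column names positive and negative are assumed to mirror the dataset's layout.

```python
import pandas as pd

# Made-up values standing in for two numeric columns of the dataset.
df = pd.DataFrame({
    "positive": [100, 200, 300, 400],
    "negative": [1000, 1500, 2500, 3000],
})

# describe() returns one row per statistic and one column per numeric column.
summary = df.describe()
print(summary)

# The median is the row labelled "50%".
print(summary.loc["50%"])
```

The result is itself a DataFrame, so you can pull out a single statistic with `.loc`, as the last line does for the median.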

Since this is still a primer, we’ll keep it simple and analyze the number of positive and negative cases over time, from the beginning of our data set (March 2020) to the end (Jan. 2021). Let’s clean up our data to keep only the columns we want and filter it for rows where the dataQualityGrade was an 'A'. Then we can create a line plot of negative and positive cases over the course of our data set.
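One way the cleaning and plotting steps might look is sketched below. The rows are made up, the column names other than dataQualityGrade are assumptions, and summing the states for each date is our own choice for getting a single line per series, not necessarily what the original notebook did.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows in the assumed layout: one row per state per date.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-03-01", "2020-03-01",
        "2020-04-01", "2020-04-01",
        "2020-05-01", "2020-05-01",
    ]),
    "state": ["CA", "NY", "CA", "NY", "CA", "NY"],
    "positive": [10, 20, 100, 200, 300, 500],
    "negative": [100, 150, 900, 1200, 2500, 4000],
    "dataQualityGrade": ["A", "B", "A", "A", "A", "A"],
})

# Keep only the columns of interest, then only the rows graded 'A'.
columns = ["date", "positive", "negative", "dataQualityGrade"]
cleaned = df[columns]
cleaned = cleaned[cleaned["dataQualityGrade"] == "A"]

# Several states can report on the same date, so add them up per date.
daily = cleaned.groupby("date")[["positive", "negative"]].sum()

# DataFrame.plot draws one line per column, with the index on the x-axis.
ax = daily.plot(kind="line", title="Positive and negative cases over time")
ax.set_xlabel("Date")
ax.set_ylabel("Number of cases")
plt.tight_layout()
# In a notebook the figure renders automatically; in a script, call plt.show().
```

The boolean mask `cleaned["dataQualityGrade"] == "A"` is the filtering step: it keeps a row only when the comparison is true for that row.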

Don’t worry if you’re not completely sure about some of the commands we ran to create the graphs or filter the data. Our curriculum covers most of the basic plotting and visualization techniques you will need for this sort of analysis on a dataset of your choosing, and it will give you resources to explore datasets in interesting ways.

You’ve just learned how to read in a file from a website, perform some basic exploratory analysis, and filter the data to create valuable visualizations. For more techniques and a deeper understanding of how to break down datasets, be sure to take a look at SciTeens’ free online curriculum and visit www.SciTeens.org for more resources!

About The Author

Shang Chen is on the executive team of SciTeens and is studying Data Science and Economics at UC Berkeley. His hobbies include working out, cooking, and playing video games. Feel free to reach out to him with comments, questions, and future article recommendations at Shang@SciTeens.org.
