Datetimes Practice

Part 1

Dates and Datetimes

  1. Create a date() object representing your birthday. Assign it to a variable and use the variable to print out your birth year.

  1. Explain why the following code returns an error:

date(2011, 2, 29)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[1], line 1
----> 1 date(2011, 2, 29)

NameError: name 'date' is not defined
  1. Create a datetime object for 6:23 pm on March 13th, 1998

  1. Create a list with 3 datetime objects in it
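As a warm-up for the items above, here is a minimal sketch of building date and datetime objects and reading their fields. The dates and variable names are arbitrary placeholders chosen for illustration; they are not the values the exercises ask for.

```python
from datetime import date, datetime

# Arbitrary example date (a placeholder, not an exercise answer)
example_day = date(2020, 7, 5)
print(example_day.year)  # individual fields are available as attributes

# datetime adds a time of day; hours use a 24-hour clock, so 2:45 pm is hour 14
example_moment = datetime(2020, 7, 5, 14, 45)
print(example_moment.hour, example_moment.minute)

# date and datetime objects can be stored in a list like any other value
moments = [example_moment, datetime(2020, 7, 6, 9, 0), datetime(2020, 7, 7, 18, 30)]
print(len(moments))
```

Both classes must be imported from the datetime module before use; calling date(...) without the import raises a NameError.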

Creating Dates and Datetimes from Strings

  1. Create a datetime object from the following string

date_string = '01-22-2009 21:00'
  1. Create a datetime object from the following string

date_string = 'Jul 23 1998 8:02:00'
  1. Create a datetime object from the following string

date_string = '12/1/72 01 52 12'
  1. Convert the date August 29th 2008 to the date in Julian days
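The items above rely on format codes. The sketch below parses a made-up string with datetime.strptime() and then formats the result with strftime(). It assumes that "Julian days" in the last item refers to the day of the year (the "%j" code); if a different convention is intended, this part does not apply. The string and variable names are hypothetical.

```python
from datetime import datetime

# Made-up timestamp string, deliberately different from the exercise strings
stamp = "2015/03/09 14:05"

# Each % code in the format must line up with the layout of the string
parsed = datetime.strptime(stamp, "%Y/%m/%d %H:%M")
print(parsed)

# strftime goes the other way; "%j" is the zero-padded day of the year
day_of_year = parsed.strftime("%j")
print(day_of_year)  # 068 (31 days in January + 28 in February + 9)
```

Separators such as "/", "-", ":" or spaces go into the format string literally, exactly as they appear in the input.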

Timedeltas

  1. Calculate how many days it has been since your last birthday.

  1. Calculate exactly how old you are, down to the hour, right now.
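Subtracting one datetime from another gives a timedelta. The sketch below uses two fixed, arbitrary datetimes so the result is reproducible; for the "right now" item, datetime.now() would take the place of the end value.

```python
from datetime import datetime

# Arbitrary start and end times (placeholders, not exercise answers)
start = datetime(2021, 1, 1, 8, 0)
end = datetime(2021, 1, 3, 20, 30)

elapsed = end - start  # subtracting two datetimes returns a timedelta
print(elapsed.days)    # whole days only: 2

# total_seconds() covers the full span, so it can be converted to hours
total_hours = elapsed.total_seconds() / 3600
print(total_hours)     # 60.5
```

Note that elapsed.days drops the leftover hours, while total_seconds() keeps them.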

Part 2

import pandas as pd

# Read the "INPUT" sheet of the spreadsheet, skipping the 7 footer rows at the bottom
was_2020_filepath = "../data/SARP 2020 final.xlsx"
was_2020 = pd.read_excel(was_2020_filepath, sheet_name="INPUT", skipfooter=7)

Question 1

Using datetime objects, calculate how long the was_2020 data record is. In other words, how much time passed between the first and the last measurement in this sample list?
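One possible approach, sketched on a toy list rather than the real file: take the earliest and latest timestamps and subtract them. The sample_times values are invented for illustration and do not come from was_2020.

```python
from datetime import datetime

# Invented measurement times, listed out of order on purpose
sample_times = [
    datetime(2020, 7, 1, 9, 0),
    datetime(2020, 7, 3, 15, 30),
    datetime(2020, 7, 2, 12, 0),
]

# min() and max() work on datetimes, so the order of the list does not matter
record_length = max(sample_times) - min(sample_times)
print(record_length)  # 2 days, 6:30:00
```

Using min() and max() avoids assuming that the first and last rows are also the earliest and latest measurements.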

Question 2

Creating a datetime column from our was_2020 DataFrame

A) In the was_2020 dataset, Date and Time are in two separate columns. Combine the two columns into one and assign the output to a new variable called combined_datetime.

To do this you will need to:

  1. Convert each column to a string type

  2. Use concatenation to combine them

# Example of string concatenation
'hello ' + 'there'
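
The same + operator works element-wise on pandas Series of strings. A minimal sketch on a toy dataframe follows; the column contents are invented, and the real Date and Time columns may hold other types after pd.read_excel(), which is why the conversion to str comes first.

```python
import pandas as pd

# Toy stand-in for the two columns (values invented for illustration)
toy = pd.DataFrame({
    "Date": ["2020-07-01", "2020-07-02"],
    "Time": ["09:15:00", "13:40:00"],
})

# astype(str) makes sure both sides are strings before concatenating;
# the " " in the middle keeps the date and the time from running together
combined = toy["Date"].astype(str) + " " + toy["Time"].astype(str)
print(combined.tolist())
```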

B) Now that you have a 'combined_datetime' variable, you can use the pandas function pd.to_datetime() to convert it from a Series of strings to a Series of datetime objects. Create a new column in your dataframe called 'datetime' for the new datetime objects.

C) Delete the old 'Date' and 'Time' columns with the DATAFRAME.drop() method.
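
Parts B and C can be sketched on a toy dataframe as follows. The data and the 'value' column are hypothetical; only pd.to_datetime() and DataFrame.drop() are taken from the text above.

```python
import pandas as pd

# Toy dataframe with string date and time columns (values invented)
toy = pd.DataFrame({
    "Date": ["2020-07-01", "2020-07-02"],
    "Time": ["09:15:00", "13:40:00"],
    "value": [1.2, 3.4],
})

# Parse the combined strings into datetime objects and store them in a new column
toy["datetime"] = pd.to_datetime(toy["Date"] + " " + toy["Time"])

# drop() returns a new dataframe, so assign the result back
toy = toy.drop(columns=["Date", "Time"])
print(toy.dtypes)
```

Without the reassignment, the original dataframe would still contain the old columns.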

Question 3

Filtering our dataframe to include only the rows within 7 days of our target date

import numpy as np

A) Let’s say that we are interested in a phenomenon that occurred on July 5th, 2020, so we want to narrow down our dataframe to include only the observations that occurred within a week of the 5th.

Start by calculating the difference between each date in the ‘datetime’ column and July 5th, 2020. What type of object is returned in the result?

B) Use the calculation from part A and write a conditional statement checking if each of the rows occurred within 7 days of the 5th. Don’t forget to include dates of samples both before and after the 5th.

C) Use the boolean series from part B as a filter to output the was_2020 dataframe with only the rows within 7 days of July 5th, 2020.
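
The pattern in parts A to C (difference, condition, filter) can be sketched on toy data. The dates, the target, and the 3-day window below are all invented, so the numbers differ from the exercise; taking the absolute value is one way to treat days before and after the target the same.

```python
import pandas as pd

# Toy dataframe with a datetime column (values invented for illustration)
toy = pd.DataFrame({
    "datetime": pd.to_datetime(["2021-01-02", "2021-01-08", "2021-01-12", "2021-01-20"]),
    "value": [1, 2, 3, 4],
})
target = pd.Timestamp("2021-01-10")

# Subtracting a single timestamp from the column gives one time difference per row
offsets = toy["datetime"] - target

# abs() folds "before" and "after" together; the comparison yields a boolean Series
within_window = offsets.abs() <= pd.Timedelta(days=3)

# A boolean Series inside [] keeps only the rows marked True
nearby = toy[within_window]
print(nearby)
```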

Question 4

# Read in the data
water_vars = pd.read_csv('../data/englewood_3_12_21_usgs_water.tsv', sep='\t', skiprows=30)
# There are a lot of variables here, so let's shorten our dataframe to a few variables
water_vars = water_vars[['datetime', '210920_00060', '210922_00010', '210924_00300', '210925_00400']]
# Get rid of the first row of hard-coded datatype info
water_vars = water_vars.drop(0)
# Rename the columns from their USGS codes to more human-readable names
name_codes = {'210920_00060': 'discharge','210922_00010': 'temperature', '210924_00300': 'dissolved oxygen', '210925_00400': 'pH'}
water_vars = water_vars.rename(columns=name_codes)
# Convert columns with numbers to a numeric type
water_vars['discharge'] = pd.to_numeric(water_vars['discharge'])
water_vars['temperature'] = pd.to_numeric(water_vars['temperature'])
water_vars['dissolved oxygen'] = pd.to_numeric(water_vars['dissolved oxygen'])
water_vars['pH'] = pd.to_numeric(water_vars['pH'])
water_vars

A) Convert the ‘datetime’ string column to a column of datetime objects using pd.to_datetime().

B) Set the new datetime column as the index of the dataframe.

C) Use the new index to retrieve the value for '2021-03-12 13:30:00'
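
A minimal sketch of parts A to C on a toy dataframe, assuming a single 'temperature' column with invented values rather than the USGS data. An explicit pd.Timestamp is used for the lookup here; .loc on a datetime index also accepts date strings.

```python
import pandas as pd

# Toy dataframe with timestamps stored as strings (values invented)
toy = pd.DataFrame({
    "datetime": ["2021-01-01 00:00", "2021-01-01 00:15", "2021-01-01 00:30"],
    "temperature": [4.0, 4.2, 4.1],
})

# Convert the strings to datetime objects, then make that column the index
toy["datetime"] = pd.to_datetime(toy["datetime"])
toy = toy.set_index("datetime")

# .loc looks rows up by index label, which is now a timestamp
row = toy.loc[pd.Timestamp("2021-01-01 00:15:00")]
print(row["temperature"])  # 4.2
```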

D) One cool thing we can do when we have a datetime index is easily resample the data. Resampling is when we aggregate more finely resolved data to be more coarsely resolved. In this example we will be taking data that is reported every 15 minutes and resampling to an hourly resolution.

Use the DATAFRAME.resample() function to resample to hourly resolution using the mean value of the 15 minute intervals. Check out the docs page or the pandas datetime overview for examples.
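
As a rough illustration of resampling, the sketch below builds two hours of made-up 15-minute data and averages it to hourly values. The index, column name, and numbers are hypothetical; only DataFrame.resample() followed by .mean() reflects the approach described above.

```python
import pandas as pd

# Eight invented readings, one every 15 minutes starting at midnight
index = pd.date_range("2021-01-01 00:00", periods=8, freq="15min")
toy = pd.DataFrame(
    {"temperature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]},
    index=index,
)

# Group the rows into 1-hour bins and take the mean of each bin
hourly = toy.resample("1h").mean()
print(hourly)
```

Each hourly value is the mean of the four 15-minute readings that fall in that hour, so the toy data above yields 2.5 and 6.5. resample() needs a datetime index (or an explicit datetime column via its on parameter), which is why the index is set first.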