This page was generated from docs/source/notebooks/1) pyLEnM - Basics.ipynb.
Case 1 - Basics
Welcome to the demonstration notebook, where we’ll go over the basic functions in the pylenm package. Let’s get started!
Installing, Importing, and Understanding the package
Make sure to install pylenm from https://pypi.org/project/pylenm/ by running pip install pylenm
in your environment terminal. Once that completes, you should be able to import the package. Note: to update to the latest version of pylenm, run pip install pylenm --upgrade
[1]:
# pip install pylenm
[2]:
# Import our packages
import pylenm
import pandas as pd
import matplotlib.pyplot as plt
pd.set_option('display.max_rows', 100) # Display a custom number of rows for dataframe
We can verify the package version by typing: pylenm.__version__
[3]:
pylenm.__version__
[3]:
'0.2'
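The same check can be made without importing the package, using the standard library's importlib.metadata. This is a minimal sketch: the helper name installed_version is ours (it is not part of pylenm), and we assume the distribution is registered under the name pylenm, as the pip command above suggests.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Hypothetical helper, not part of pylenm: report what pip installed.
print(installed_version("pylenm"))
```

If this prints None, the package is not installed in the active environment and the pip install step needs to be repeated there.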
The pylenm package has one module, PylenmDataFactory, through which all of the package functionality is accessed.
Let’s import it!
[4]:
from pylenm import PylenmDataFactory
Using pylenm
Now let’s get our hands dirty and start exploring the PylenmDataFactory module.
First, let’s import our two datasets. They’re stored in a folder called data alongside the notebooks; the cell below reads them from the project’s GitHub repository. pylenm takes in two datasets:
1. Concentration Data: dataset containing the concentration values.
2. Construction Data: dataset containing the construction information of the locations, such as the longitude and latitude values.
[5]:
url_1 = 'https://raw.githubusercontent.com/ALTEMIS-DOE/pylenm/master/notebooks/data/FASB_Data_thru_3Q2015_Reduced_Demo.csv'
url_2 = 'https://github.com/ALTEMIS-DOE/pylenm/blob/master/notebooks/data/FASB%20Well%20Construction%20Info.xlsx?raw=true'
concentration_data = pd.read_csv(url_1)
construction_data = pd.read_excel(url_2)
Now that we have our data imported, let’s save it into our functions class.
We do this by initializing PylenmDataFactory with our data. We’ll call the instance pylenm_df.
[6]:
# Create instance
pylenm_df = PylenmDataFactory(concentration_data) # Save concentration data
pylenm_df.setConstructionData(construction_data) # Save construction data
Successfully imported the data!
Successfully imported the construction data!
If the data passed doesn’t meet the package requirements, you’ll be presented with the following message.
To demonstrate the requirements, we’ll pass an arbitrary list instead of a DataFrame.
[7]:
PylenmDataFactory([1,2,3,4,5])
ERROR: Make sure the data is a pandas DataFrame.
PYLENM DATA REQUIREMENTS:
The imported data needs to meet ALL of the following conditions to have a successful import:
1) Data should be a pandas dataframe.
2) Data must have these column names:
['COLLECTION_DATE','STATION_ID','ANALYTE_NAME','RESULT','RESULT_UNITS']
[7]:
<pylenm.pylenm.PylenmDataFactory at 0x112f514c0>
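Before handing a dataset to PylenmDataFactory, we can check the column requirement ourselves. The sketch below is not pylenm's own validation code. It only compares a collection of column names against the required list printed in the message above, and the helper name missing_required_columns is ours.

```python
REQUIRED_COLUMNS = ['COLLECTION_DATE', 'STATION_ID', 'ANALYTE_NAME', 'RESULT', 'RESULT_UNITS']


def missing_required_columns(columns):
    """Return the required column names that are absent from `columns`, in the listed order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Works with any iterable of names, for example `concentration_data.columns`.
example_columns = ['STATION_ID', 'ANALYTE_NAME', 'RESULT', 'FACILITY']
print(missing_required_columns(example_columns))
# → ['COLLECTION_DATE', 'RESULT_UNITS']
```

An empty list means the column requirement is met; the data still has to be a pandas DataFrame for the import to succeed.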
We can view the concentration data we imported by running the getData()
function.
[8]:
pylenm_df.getData().head()
[8]:
ID | FACILITY | FFA_INDEX_NO | SRS_AREA | OPERABLE_UNIT | STATION_ID | STATION_TYPE | SEIR_ID | WELL_USE | REVISION_ID | ... | RESULT_UNITS | COUNTING_UNCERTAINTY | MATRIX_CODE | VALIDATION_STATUS | FIELD_QC_CODE | LAB_QC_CODE | LAB_CODE | ANALYTE_TYPE | DILUTION_FACTOR | SAMPLER_COMMENTS | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 767202 | SRS | 19 | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FOB 15D | MONITORING WELL | NaN | F_PLUME_ASSESS; H_AUX;MW_AUX | NaN | ... | ft | NaN | NaN | UE | NaN | NaN | NaN | NaN | NaN | PROG_NAME=W |
1 | 767203 | SRS | 19 | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FSB133D | MONITORING WELL | NaN | F_PLUME_ASSESS; H_AUX;MW_AUX | NaN | ... | ft | NaN | NaN | UE | NaN | NaN | NaN | NaN | NaN | PROG_NAME=W |
2 | 767204 | SRS | 19 | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FSB134D | MONITORING WELL | NaN | F_PLUME_ASSESS; H_AUX;MW_AUX | NaN | ... | ft | NaN | NaN | UE | NaN | NaN | NaN | NaN | NaN | PROG_NAME=W |
3 | 767205 | SRS | 19 | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FSB136D | MONITORING WELL | NaN | F_PLUME_ASSESS; H_AUX;MW_AUX | NaN | ... | ft | NaN | NaN | UE | NaN | NaN | NaN | NaN | NaN | PROG_NAME=W |
4 | 767206 | SRS | 19 | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FSB122D | MONITORING WELL | NaN | F_PLUME;H_AUX;MW_AUX | NaN | ... | ft | NaN | NaN | UE | NaN | NaN | NaN | NaN | NaN | PROG_NAME=W |
5 rows × 37 columns
The same can be done for the construction data by running the get_Construction_Data()
function.
[9]:
pylenm_df.get_Construction_Data().head()
[9]:
FACILITY_ID | SITE_GROUP | SITE_NAME | STATION_TYPE | WELL USE | AQUIFER | WELL_USE | LATITUDE | LONGITUDE | EASTING | ... | SZ_BOT(FT MSL) | GROUND_ELEVATION | REFERENCE_ELEVATION_CODE | REFERENCE_ELEVATION | TOTAL_DEPTH | CONSTR_OBJ_DIAMETER | CONSTR_OBJ_MATERIAL | PUMP_TYPE | COMPLETION_DATE | DATE_SEALED | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
STATION_ID | |||||||||||||||||||||
FAI001A | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | MONITORING WELL | Auxiliary Observation | UAZ_UTRAU | ACTIVE ECO-SENSITIVE | 33.273872 | -81.622904 | 441989.564 | ... | 231.30 | 250.1 | C | 252.63 | 19.10 | 2.0 | PVC | NONE | 2016-03-22 | NaT |
FAI001B | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | MONITORING WELL | Auxiliary Observation | UAZ_UTRAU | ACTIVE ECO-SENSITIVE | 33.273873 | -81.622891 | 441990.781 | ... | 240.60 | 250.2 | C | 252.73 | 9.90 | 2.0 | PVC | NONE | 2016-03-22 | NaT |
FAI001C | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | MONITORING WELL | Auxiliary Observation | UAZ_UTRAU | ACTIVE ECO-SENSITIVE | 33.273874 | -81.622895 | 441990.432 | ... | 242.68 | 250.2 | C | 252.74 | 7.82 | 2.0 | PVC | NONE | 2016-03-22 | NaT |
FAI001D | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | MONITORING WELL | Auxiliary Observation | UAZ_UTRAU | ACTIVE ECO-SENSITIVE | 33.273874 | -81.622901 | 441989.928 | ... | 246.75 | 250.1 | C | 252.56 | 3.65 | 2.0 | PVC | NONE | 2016-03-22 | NaT |
FAI002A | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | MONITORING WELL | Auxiliary Observation | UAZ_UTRAU | ACTIVE ECO-SENSITIVE | 33.263961 | -81.685462 | 436156.287 | ... | 165.88 | 185.1 | C | 187.58 | 3.65 | 2.0 | PVC | NONE | 2016-03-29 | NaT |
5 rows × 22 columns
At any point we can also reset the current working dataset by running setData()
for the concentration data and setConstructionData()
for the construction data.
[10]:
pylenm_df.setData(concentration_data)
pylenm_df.setConstructionData(construction_data)
Successfully imported the data!
Successfully imported the construction data!
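The simplify_data function reduces the data to the columns needed for analysis (see the five columns in the output below). We’ll set the inplace parameter to True to make the simplified data the default working dataset.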
[11]:
pylenm_df.simplify_data(inplace=True)
[11]:
COLLECTION_DATE | STATION_ID | ANALYTE_NAME | RESULT | RESULT_UNITS | |
---|---|---|---|---|---|
0 | 1990-01-01 | FSB 77 | PH | 3.30 | pH |
1 | 1990-01-01 | FSB105C | DEPTH_TO_WATER | 80.49 | ft |
2 | 1990-01-01 | FSB107D | SPECIFIC CONDUCTANCE | 289.00 | uS/cm |
3 | 1990-01-01 | FSB101A | DEPTH_TO_WATER | 134.49 | ft |
4 | 1990-01-01 | FSB101A | SPECIFIC CONDUCTANCE | 161.00 | uS/cm |
... | ... | ... | ... | ... | ... |
86206 | 2015-09-24 | FBI 15D | DEPTH_TO_WATER | 15.40 | ft |
86207 | 2015-09-24 | FSB127D | DEPTH_TO_WATER | 18.20 | ft |
86208 | 2015-09-24 | FSB113C | DEPTH_TO_WATER | 23.20 | ft |
86209 | 2015-09-24 | FOB 13D | DEPTH_TO_WATER | 21.50 | ft |
86210 | 2015-09-24 | FSB 79 | DEPTH_TO_WATER | 19.76 | ft |
86211 rows × 5 columns
Basic Functions
Let’s generate some useful reports about the data by calling get_analyte_details and get_data_summary.
[12]:
pylenm_df.get_analyte_details(analyte_name='DEPTH_TO_WATER')
[12]:
Start Date | End Date | Date Range (days) | Unique samples | |
---|---|---|---|---|
Well Name | ||||
FSB 94D | 1990-01-01 | 1990-10-08 | 280 | 4 |
FSB 95D | 1990-01-01 | 1990-10-08 | 280 | 4 |
FSB 77 | 1990-01-01 | 2006-10-16 | 6132 | 122 |
FSB111C | 1990-01-01 | 2006-10-17 | 6133 | 115 |
FSB105C | 1990-01-01 | 2006-10-19 | 6135 | 127 |
... | ... | ... | ... | ... |
FSB146D | 2015-04-29 | 2015-09-09 | 133 | 9 |
FSB145D | 2015-04-30 | 2015-09-09 | 132 | 9 |
FSB143D | 2015-05-04 | 2015-09-09 | 128 | 9 |
FSB144D | 2015-05-04 | 2015-09-09 | 128 | 9 |
FSB142D | 2015-05-05 | 2015-09-09 | 127 | 9 |
155 rows × 4 columns
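To make the columns of this report concrete, here is a plain-Python sketch of how such a per-well summary could be computed. It is not pylenm's implementation: the records and well names are made up, and we assume that "Unique samples" counts distinct collection dates.

```python
from datetime import date

# Made-up records: (collection date, well name, analyte, result).
records = [
    (date(1990, 1, 1), 'WELL_A', 'DEPTH_TO_WATER', 80.5),
    (date(1990, 4, 1), 'WELL_A', 'DEPTH_TO_WATER', 81.0),
    (date(1990, 4, 1), 'WELL_A', 'PH', 5.2),
    (date(1990, 10, 8), 'WELL_A', 'DEPTH_TO_WATER', 79.9),
    (date(1991, 2, 5), 'WELL_B', 'DEPTH_TO_WATER', 22.1),
    (date(1991, 2, 5), 'WELL_B', 'DEPTH_TO_WATER', 22.3),
]


def analyte_details(records, analyte_name):
    """Summarize the sampling period of one analyte for every well that reports it."""
    dates_by_well = {}
    for collected, well, analyte, _result in records:
        if analyte == analyte_name:
            dates_by_well.setdefault(well, set()).add(collected)
    details = {}
    for well, dates in dates_by_well.items():
        start, end = min(dates), max(dates)
        details[well] = {
            'start': start,
            'end': end,
            'range_days': (end - start).days,
            'unique_samples': len(dates),  # assumption: distinct collection dates
        }
    return details


for well, info in analyte_details(records, 'DEPTH_TO_WATER').items():
    print(well, info)
```

WELL_A spans 1990-01-01 to 1990-10-08, which gives the same 280-day range shown for the first two wells in the table above.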
Most functions have a filtering feature, which uses the filter_by_column
function to reduce the search space. We’ll run get_analyte_details
again, but this time we’ll view the results for the ‘Lower Aquifer’ wells only.
[13]:
pylenm_df.get_analyte_details('TRITIUM', filter=True, col='AQUIFER', equals=['LAZ_UTRAU'])
[13]:
Start Date | End Date | Date Range (days) | Unique samples | |
---|---|---|---|---|
Well Name | ||||
FSB 93C | 1990-01-01 | 2015-07-28 | 9339 | 103 |
FSB107C | 1990-01-01 | 2015-08-04 | 9346 | 103 |
FSB 94C | 1990-01-02 | 2015-07-28 | 9338 | 109 |
FSB 97C | 1990-01-03 | 2015-07-29 | 9338 | 103 |
FSB 98C | 1990-01-03 | 2015-07-29 | 9338 | 102 |
FSB 87C | 1990-01-06 | 2015-07-27 | 9333 | 103 |
FSB102C | 1990-01-06 | 2015-07-28 | 9334 | 102 |
FSB 90C | 1990-01-07 | 2015-07-28 | 9333 | 103 |
FSB 78C | 1990-01-07 | 2015-08-03 | 9339 | 109 |
FSB 91C | 1990-01-07 | 2015-08-03 | 9339 | 98 |
FSB 76C | 1990-01-09 | 2015-07-27 | 9330 | 114 |
FSB 99C | 1990-01-09 | 2015-07-29 | 9332 | 103 |
FSB104C | 1990-01-09 | 2015-07-30 | 9333 | 103 |
FSB 79C | 1990-01-09 | 2015-08-03 | 9337 | 102 |
FSB 88C | 1990-01-16 | 2015-07-28 | 9324 | 103 |
FSB 89C | 1990-01-16 | 2015-07-28 | 9324 | 103 |
FSB114C | 1991-02-05 | 2015-07-27 | 8938 | 99 |
FSB122C | 1991-02-06 | 2015-07-30 | 8940 | 97 |
FSB120C | 1991-02-06 | 2015-08-04 | 8945 | 99 |
FSB112C | 1991-02-06 | 2015-08-10 | 8951 | 97 |
FSB121C | 1991-02-07 | 2015-02-04 | 8763 | 73 |
FSB123C | 1991-02-10 | 2015-01-21 | 8746 | 73 |
FSB116C | 1991-02-10 | 2015-01-22 | 8747 | 60 |
FSB115C | 1991-02-11 | 2015-01-21 | 8745 | 60 |
FSB113C | 1991-02-11 | 2015-07-27 | 8932 | 78 |
FSB 95CR | 1991-02-24 | 2015-07-29 | 8921 | 105 |
FSB 92C | 1993-11-07 | 2015-08-10 | 7946 | 87 |
FOB 2C | 2004-05-05 | 2004-05-05 | 0 | 1 |
FOB 14C | 2004-05-05 | 2004-05-05 | 0 | 1 |
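The filtering idea can be sketched in two steps: pick the wells whose construction attribute matches, then keep only the measurements from those wells. The example below uses made-up data and plain Python. The names wells_matching and filter_measurements are ours, and pylenm's own filter_by_column may work differently internally.

```python
# Made-up construction lookup: well name -> attributes.
construction = {
    'WELL_A': {'AQUIFER': 'LAZ_UTRAU'},
    'WELL_B': {'AQUIFER': 'UAZ_UTRAU'},
    'WELL_C': {'AQUIFER': 'LAZ_UTRAU'},
}

# Made-up concentration records: (well name, analyte, result).
measurements = [
    ('WELL_A', 'TRITIUM', 120.0),
    ('WELL_B', 'TRITIUM', 15.0),
    ('WELL_C', 'TRITIUM', 640.0),
    ('WELL_B', 'PH', 5.1),
]


def wells_matching(construction, col, equals):
    """Return the wells whose construction attribute `col` is one of `equals`."""
    allowed = set(equals)
    return {well for well, attrs in construction.items() if attrs.get(col) in allowed}


def filter_measurements(measurements, wells):
    """Keep only the measurements taken at the selected wells."""
    return [row for row in measurements if row[0] in wells]


lower_wells = wells_matching(construction, col='AQUIFER', equals=['LAZ_UTRAU'])
print(filter_measurements(measurements, lower_wells))
# → [('WELL_A', 'TRITIUM', 120.0), ('WELL_C', 'TRITIUM', 640.0)]
```

Because equals is a list, several values can be accepted at once, for example both aquifer codes.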
get_data_summary
is another useful summarization function.
[14]:
pylenm_df.get_data_summary(analytes=['TRITIUM','IODINE-129','SPECIFIC CONDUCTANCE',
'PH','URANIUM-238', 'DEPTH_TO_WATER'], sort_by='wells')
[14]:
Start Date | End Date | Date Range (days) | # unique wells | # samples | Unit | mean | std | min | 25% | 50% | 75% | max | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Analyte Name | |||||||||||||
TRITIUM | 1990-01-01 | 2015-09-23 | 9396 days | 160 | 13476 | pCi/mL | 1248.316406 | 3261.565925 | -0.628 | 25.300 | 193.00 | 934.00 | 44800.0 |
PH | 1990-01-01 | 2015-09-23 | 9396 days | 158 | 18517 | pH | 5.354886 | 1.529332 | 0.000 | 4.100 | 5.20 | 6.40 | 12.7 |
SPECIFIC CONDUCTANCE | 1990-01-01 | 2015-09-23 | 9396 days | 156 | 18276 | uS/cm | 425.754861 | 555.068915 | 0.000 | 87.000 | 213.00 | 544.00 | 10700.0 |
DEPTH_TO_WATER | 1990-01-01 | 2015-09-24 | 9397 days | 155 | 20091 | ft | 46.814224 | 38.137888 | 0.500 | 16.450 | 32.17 | 72.90 | 177.2 |
IODINE-129 | 1990-01-06 | 2015-09-23 | 9391 days | 150 | 5405 | pCi/L | 41.624339 | 92.511351 | -356.900 | 0.996 | 9.37 | 45.10 | 1620.0 |
URANIUM-238 | 1990-01-06 | 2015-09-23 | 9391 days | 144 | 5319 | pCi/L | 66.398114 | 169.240747 | -2.650 | 0.062 | 0.64 | 43.75 | 1810.0 |
Here is the same function call, this time with the filtering we applied above (‘Lower Aquifer’ wells).
[15]:
pylenm_df.get_data_summary(analytes=['TRITIUM','IODINE-129','SPECIFIC CONDUCTANCE',
'PH','URANIUM-238', 'DEPTH_TO_WATER'],
filter=True, col='AQUIFER', equals=['LAZ_UTRAU'], sort_by='wells')
[15]:
Start Date | End Date | Date Range (days) | # unique wells | # samples | Unit | mean | std | min | 25% | 50% | 75% | max | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Analyte Name | |||||||||||||
TRITIUM | 1990-01-01 | 2015-08-10 | 9352 days | 29 | 2961 | pCi/mL | 1904.185517 | 3350.574654 | -0.387 | 69.7000 | 638.00 | 1630.000 | 40000.00 |
SPECIFIC CONDUCTANCE | 1990-01-01 | 2015-08-10 | 9352 days | 29 | 4350 | uS/cm | 547.272770 | 668.286641 | 0.000 | 72.0000 | 300.00 | 660.000 | 10600.00 |
PH | 1990-01-01 | 2015-08-10 | 9352 days | 29 | 4402 | pH | 5.208301 | 1.135237 | 0.000 | 4.4000 | 5.30 | 5.830 | 12.60 |
DEPTH_TO_WATER | 1990-01-01 | 2015-09-24 | 9397 days | 29 | 4138 | ft | 54.328794 | 25.240702 | 4.750 | 25.5150 | 67.45 | 75.680 | 120.13 |
IODINE-129 | 1990-01-06 | 2015-08-10 | 9347 days | 27 | 1246 | pCi/L | 29.382026 | 40.177638 | -16.150 | 2.0000 | 11.00 | 36.075 | 233.00 |
URANIUM-238 | 1990-01-06 | 2015-08-10 | 9347 days | 27 | 1277 | pCi/L | 60.123929 | 130.590327 | -0.408 | 0.0527 | 0.48 | 46.000 | 1030.00 |
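For readers curious about what goes into such a summary, the sketch below computes a few of the same columns with the standard library's statistics module. It is only an illustration on made-up rows, not pylenm's code. We also assume, judging from the output above, that sort_by='wells' orders the rows by the number of unique wells in descending order.

```python
from statistics import mean, median, stdev

# Made-up rows: (well name, analyte, result).
rows = [
    ('WELL_A', 'TRITIUM', 100.0),
    ('WELL_B', 'TRITIUM', 300.0),
    ('WELL_C', 'TRITIUM', 200.0),
    ('WELL_A', 'PH', 4.0),
    ('WELL_A', 'PH', 6.0),
]


def data_summary(rows, analytes):
    """Summarize each requested analyte, ordered by number of unique wells (descending)."""
    summary = []
    for analyte in analytes:
        subset = [(well, value) for well, name, value in rows if name == analyte]
        if not subset:
            continue  # skip analytes with no data
        values = [value for _well, value in subset]
        summary.append({
            'analyte': analyte,
            'unique_wells': len({well for well, _value in subset}),
            'samples': len(values),
            'mean': mean(values),
            'std': stdev(values) if len(values) > 1 else float('nan'),  # sample std
            'min': min(values),
            'median': median(values),
            'max': max(values),
        })
    return sorted(summary, key=lambda item: item['unique_wells'], reverse=True)


for item in data_summary(rows, ['PH', 'TRITIUM', 'IODINE-129']):
    print(item)
```

TRITIUM is listed first because it was sampled at three wells, while IODINE-129 is dropped because the made-up rows contain no data for it.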
[16]:
pylenm_df.filter_by_column(construction_data, col='AQUIFER', equals=['LAZ_UTRAU']).head()
[16]:
FACILITY_ID | SITE_GROUP | SITE_NAME | STATION_ID | STATION_TYPE | WELL USE | AQUIFER | WELL_USE | LATITUDE | LONGITUDE | ... | SZ_BOT(FT MSL) | GROUND_ELEVATION | REFERENCE_ELEVATION_CODE | REFERENCE_ELEVATION | TOTAL_DEPTH | CONSTR_OBJ_DIAMETER | CONSTR_OBJ_MATERIAL | PUMP_TYPE | COMPLETION_DATE | DATE_SEALED | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
44 | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FBI 14C | MONITORING WELL | Auxiliary Observation | LAZ_UTRAU | ACTIVE | 33.271667 | -81.677454 | ... | 148.78 | 222.80 | S | 225.58 | 76.52 | 2.0 | PVC | VARIABLE SPEED PUMP | 2003-12-30 | NaT |
47 | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FBI 16C | MONITORING WELL | Auxiliary Observation | LAZ_UTRAU | ACTIVE | 33.271394 | -81.678173 | ... | 145.00 | 225.00 | S | 227.92 | 81.60 | 2.0 | PVC | VARIABLE SPEED PUMP | 2003-12-19 | NaT |
144 | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FOB 2C | MONITORING WELL | Auxiliary Observation | LAZ_UTRAU | ACTIVE | 33.270220 | -81.680036 | ... | 146.56 | 226.58 | S | 229.57 | 82.52 | 2.0 | PVC | VARIABLE SPEED PUMP | 2003-12-31 | NaT |
148 | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FOB 5C | MONITORING WELL | Auxiliary Observation | LAZ_UTRAU | ACTIVE | 33.271922 | -81.680770 | ... | 129.32 | 256.34 | C | 258.84 | 132.00 | 2.0 | PVC | NaN | 1996-10-11 | NaT |
150 | SRS | GSA | F & H-AREA HAZARDOUS WASTE MANAGEMENT FACILITI... | FOB 7C | MONITORING WELL | Auxiliary Observation | LAZ_UTRAU | ACTIVE | 33.275991 | -81.682290 | ... | 148.87 | 295.85 | C | 298.35 | 152.00 | 2.0 | PVC | NaN | 1996-10-08 | NaT |
5 rows × 23 columns
[17]:
pylenm_df.get_well_analytes(well_name='FSB102C')
FSB102C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
[18]:
pylenm_df.get_well_analytes(filter=True, col='AQUIFER', equals=['LAZ_UTRAU'])
FSB107C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 93C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 94C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 97C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 98C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB102C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 87C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 90C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 78C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 91C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB104C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 99C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 76C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 79C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 88C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 89C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB114C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB122C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB120C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB112C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB121C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB116C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB123C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB115C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB113C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 95CR
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FSB 92C
['DEPTH_TO_WATER', 'IODINE-129', 'PH', 'SPECIFIC CONDUCTANCE', 'STRONTIUM-90', 'TRITIUM', 'URANIUM-238']
FOB 14C
['DEPTH_TO_WATER', 'PH', 'SPECIFIC CONDUCTANCE', 'TRITIUM']
FOB 2C
['DEPTH_TO_WATER', 'PH', 'SPECIFIC CONDUCTANCE', 'TRITIUM']
[19]:
pylenm_df.plot_all_time_series_simple(start_date='2000-1-1', min_days=500)
[20]:
ur = pylenm_df.plot_all_time_series_simple(analyte_name='URANIUM-238', x_max_lim=150, return_data=True)
ur
[20]:
STATION_ID | START_DATE | END_DATE | RANGE | |
---|---|---|---|---|
0 | FSB 92D | 1990-01-06 | 2015-08-03 | 9340 |
1 | FSB 91D | 1990-01-06 | 2015-08-03 | 9340 |
2 | FSB 78C | 1990-01-07 | 2015-08-03 | 9339 |
3 | FSB 79C | 1990-01-09 | 2015-08-03 | 9337 |
4 | FSB 79 | 1990-01-09 | 2015-08-03 | 9337 |
... | ... | ... | ... | ... |
139 | FSB139D | 2012-06-05 | 2013-06-18 | 378 |
140 | FSB140D | 2012-06-05 | 2013-06-18 | 378 |
141 | FSP 2C | 2002-06-25 | 2002-09-17 | 84 |
142 | FOB 1D | 2004-04-23 | 2004-06-22 | 60 |
143 | FOB 14D | 2004-06-23 | 2004-08-19 | 57 |
144 rows × 4 columns
[21]:
# Added filtering of 'Upper Aquifer' Wells
pylenm_df.plot_all_time_series_simple(analyte_name='IODINE-129',min_days=3*365, x_max_lim=110, filter=True, col='AQUIFER', equals=['UAZ_UTRAU'], start_date='2005-1-1')
[22]:
WT_UA = pylenm_df.plot_all_time_series(analyte_name='DEPTH_TO_WATER', filter=True, col='AQUIFER', equals=['UAZ_UTRAU'],
x_label_size=15, marker_size=60, return_data=True, cbar_max=100, y_2nd_label='DEPTH_TO_WATER (ft)')
[23]:
TR_UP = pylenm_df.plot_all_time_series(analyte_name='TRITIUM', filter=True, col='AQUIFER', equals=['UAZ_UTRAU'],
x_label_size=15, marker_size=60, log_transform=True, return_data=True, drop_cols=['FSB131D'],
cbar_min=-0.5, y_2nd_label='Log Tritium Concentration (pCi/mL)')
[24]:
pylenm_df.plot_data(well_name= 'FSB 95DR', analyte_name='TRITIUM', y_label="I CAN CHANGE")
[25]:
pylenm_df.plot_data(well_name= 'FSB 98C', analyte_name='SPECIFIC CONDUCTANCE', year_interval=5)
When using the interpolation functionality, a resampling frequency can be specified. Valid inputs for the frequency can be found in the function description.
The interpolate_wells_by_analyte
function interpolates and combines all of the wells that share the same analyte. There is no need to specify a start and end date, as the algorithm will find the optimum start and end dates.
[26]:
wt = pylenm_df.interpolate_wells_by_analyte(analyte='DEPTH_TO_WATER',
frequency= 'D',
rm_outliers=True,
z_threshold=2.5)
wt
[26]:
FBI 14D | FSB115D | FSB113A | FSB113C | FSB113D | FSB114A | FSB114C | FSB114D | FSB115C | FSB116C | ... | FSB 90C | FSB 94C | FSB 90D | FSB 91C | FSB 91D | FSB 92C | FSB 92D | FSB 93C | FSB 93D | FSP249B | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2003-11-18 | 22.280040 | 17.512000 | 65.431169 | 22.193247 | 17.418831 | 97.627037 | 40.203704 | 36.099259 | 23.501053 | 13.566211 | ... | 68.502247 | 74.685714 | 64.122414 | 69.190000 | 66.047753 | 67.300602 | 64.672727 | 69.222667 | 67.020779 | 3.550000 |
2003-11-19 | 22.268978 | 17.518000 | 65.430519 | 22.189221 | 17.419481 | 97.625432 | 40.209877 | 36.118025 | 23.498947 | 13.563789 | ... | 68.523034 | 74.696429 | 64.156322 | 69.203750 | 66.076966 | 67.312048 | 64.692045 | 69.220111 | 67.037013 | 3.542857 |
2003-11-20 | 22.257916 | 17.524000 | 65.429870 | 22.185195 | 17.420130 | 97.623827 | 40.216049 | 36.136790 | 23.496842 | 13.561368 | ... | 68.543820 | 74.707143 | 64.190230 | 69.217500 | 66.106180 | 67.323494 | 64.711364 | 69.217556 | 67.053247 | 3.535714 |
2003-11-21 | 22.246854 | 17.530000 | 65.429221 | 22.181169 | 17.420779 | 97.622222 | 40.222222 | 36.155556 | 23.494737 | 13.558947 | ... | 68.564607 | 74.717857 | 64.224138 | 69.231250 | 66.135393 | 67.334940 | 64.730682 | 69.215000 | 67.069481 | 3.528571 |
2003-11-22 | 22.235792 | 17.536000 | 65.428571 | 22.177143 | 17.421429 | 97.620617 | 40.228395 | 36.174321 | 23.492632 | 13.556526 | ... | 68.585393 | 74.728571 | 64.258046 | 69.245000 | 66.164607 | 67.346386 | 64.750000 | 69.212444 | 67.085714 | 3.521429 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2015-09-20 | 17.370588 | 17.500000 | 67.000000 | 23.191207 | 18.200000 | 98.100000 | 41.600000 | 37.400000 | 23.900000 | 14.000000 | ... | 69.600000 | 73.570000 | 65.440000 | 70.600000 | 66.680000 | 67.300000 | 64.400000 | 68.700000 | 65.320000 | 3.340000 |
2015-09-21 | 17.380392 | 17.500000 | 67.000000 | 23.194138 | 18.200000 | 98.100000 | 41.600000 | 37.400000 | 23.900000 | 14.000000 | ... | 69.600000 | 73.570000 | 65.440000 | 70.600000 | 66.680000 | 67.300000 | 64.400000 | 68.700000 | 65.320000 | 3.340000 |
2015-09-22 | 17.390196 | 17.500000 | 67.000000 | 23.197069 | 18.200000 | 98.100000 | 41.600000 | 37.400000 | 23.900000 | 14.000000 | ... | 69.600000 | 73.570000 | 65.440000 | 70.600000 | 66.680000 | 67.300000 | 64.400000 | 68.700000 | 65.320000 | 3.340000 |
2015-09-23 | 21.100414 | 17.945698 | 66.153712 | 22.145371 | 17.243528 | 97.637192 | 41.290187 | 37.941873 | 23.780778 | 13.833220 | ... | 69.695045 | 75.251497 | 66.101937 | 70.636603 | 67.665691 | 68.499770 | 66.081719 | 69.538644 | 67.544049 | 3.585014 |
2015-09-24 | 17.400000 | 17.500000 | 67.000000 | 23.200000 | 18.200000 | 98.100000 | 41.600000 | 37.400000 | 23.900000 | 14.000000 | ... | 69.600000 | 73.570000 | 65.440000 | 70.600000 | 66.680000 | 67.300000 | 64.400000 | 68.700000 | 65.320000 | 3.340000 |
4329 rows × 155 columns
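To show what interpolating onto a daily grid means, here is a small standalone sketch of linear interpolation between dated samples for a single well. It uses only the standard library and made-up readings. pylenm's own resampling and interpolation steps may differ, so treat this as an illustration of the general technique rather than the package's algorithm.

```python
from datetime import date, timedelta


def interpolate_daily(samples):
    """Linearly interpolate non-empty (date, value) samples onto every day from first to last."""
    samples = sorted(samples)
    series = {}
    for (d0, v0), (d1, v1) in zip(samples, samples[1:]):
        span = (d1 - d0).days
        for offset in range(span):
            # Fraction of the way from the earlier sample to the later one.
            series[d0 + timedelta(days=offset)] = v0 + (v1 - v0) * offset / span
    last_day, last_value = samples[-1]
    series[last_day] = last_value
    return series


# Made-up depth-to-water readings for a single well.
readings = [(date(2003, 11, 18), 22.0), (date(2003, 11, 22), 24.0), (date(2003, 11, 24), 23.0)]
for day, value in interpolate_daily(readings).items():
    print(day, round(value, 3))
```

Three irregular readings become seven daily values, with the gaps filled along straight lines between neighbouring samples.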
Let’s look at the difference between removing the outliers and keeping them.
[27]:
freq = '4M'
analyte = 'DEPTH_TO_WATER'
wt_with_outliers = pylenm_df.interpolate_wells_by_analyte(analyte = analyte,
frequency = freq,
rm_outliers=False)
wt_without_outliers = pylenm_df.interpolate_wells_by_analyte(analyte = analyte,
frequency = freq,
rm_outliers=True,
z_threshold=1.5)
wt_with_outliers['FSB131D'].plot()
wt_without_outliers['FSB131D'].plot()
[27]:
<AxesSubplot:>
[28]:
wt_with_outliers.plot(legend=None)
wt_without_outliers.plot(legend=None)
[28]:
<AxesSubplot:>
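The parameter name z_threshold suggests a z-score cutoff: values that lie more than a given number of standard deviations from the mean are treated as outliers. The sketch below shows that general technique on made-up water levels. It is not pylenm's implementation, and the helper name remove_outliers is ours.

```python
from statistics import mean, pstdev


def remove_outliers(values, z_threshold=2.5):
    """Drop values whose absolute z-score exceeds `z_threshold`."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs(v - mu) / sigma <= z_threshold]


# Made-up water-level readings with one moderate and one large spike.
levels = [17.4, 17.5, 17.6, 17.5, 17.4, 17.6, 17.5, 17.5, 17.4, 17.6, 27.5, 32.5]
print(len(remove_outliers(levels, z_threshold=2.5)))  # only the largest spike is dropped
print(len(remove_outliers(levels, z_threshold=1.5)))  # both spikes are dropped
```

A lower threshold is more aggressive, which is why the z_threshold=1.5 run above removes more points than the 2.5 setting used earlier.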
The interpolate_well_data
function is useful if you are only looking at one well at a time, but it has limited functionality, as it can take only ONE well as a parameter.
[29]:
pylenm_df.interpolate_well_data(well_name='FSB 95DR',
analytes=['DEPTH_TO_WATER', 'URANIUM-238'],
frequency='2M')
[29]:
DEPTH_TO_WATER | URANIUM-238 | |
---|---|---|
COLLECTION_DATE | ||
1993-02-28 | 73.030 | 1620.000000 |
1993-04-30 | 72.120 | 1475.000000 |
1993-06-30 | 71.325 | 1330.000000 |
1993-08-31 | 71.940 | 1100.000000 |
1993-10-31 | 72.540 | 1155.000000 |
... | ... | ... |
2014-12-31 | 75.375 | 239.666667 |
2015-02-28 | 75.610 | 232.000000 |
2015-04-30 | 75.000 | 224.333333 |
2015-06-30 | 74.615 | 216.666667 |
2015-08-31 | 74.230 | 209.000000 |
136 rows × 2 columns