Processing Data Workflow
- Visit Site & Download Data with Toughbook
  - Data should go in ~/Dropbox/logger_data/$YEAR/$SITE/
  - The general naming convention for data files is $SITE_$TABLE-$YEAR_$MONTH_$DAY.dat (see the filename example after this list)
- Back in camp, in town, or anywhere else with internet access, put the Toughbook online so Dropbox can mirror the data to the other computers
- Once out of the field and back in the office, copy this data from Dropbox to /var/site/$AREA/$SITE/raw or similar (e.g. /var/site/utq/utq_A/raw/); the copy sketch after this list shows one way to do this
- Most processing scripts are located in /var/site/bin/ with a name like process_$SITE.sh (e.g. process_teller_bottom.sh)
- To this script, add lines in each appropriate section so that datapro can process the new data
- In the initial cut (I'm just starting to do this, 10/2018) the script has two related bash functions: 1) delete all the processed data in /var/site/$AREA/$SITE/outputs, and 2) apply the manual QA. I comment out the calls to both on the first run so that processing is quicker, so edit the bash file to comment out those two function calls as well (see the script sketch after this list)
- Next, run the bash script
- Once done, the data will appear in /var/site/$AREA/$SITE/outputs/. I have a web page that can be used for visualizations: /var/site/$AREA/$SITE/index.html. After the automated processing has fixed everything, I review this page to catch the things it misses (for instance, animal damage knocking a radiometer off level, or other oddities where the data appears right but isn't quite)
- Any manual corrections go in an Excel spreadsheet: /var/site/$AREA/$SITE/qc/$SENSORNAME_fixes.xlsx
- There should be a line in the process_$SITE.sh subroutine for manual fixes for this file (most scripts don't have this line until there is manual correcting to do)
- After the manual corrections are complete and the script is updated, re-run /var/site/bin/process_$SITE.sh with everything uncommented so that the final dataset is created from scratch and the manual edits are applied
- Upload the final product from the local computer to the main server, data archive, etc. (ngeedata / ocotal / project data portal); the upload sketch after this list shows one possibility
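
Filename example: a minimal illustration of the naming convention above, assuming a hypothetical October 2018 visit to site utq_A with a logger table named CR1000_met (the site and table names are made up for illustration):

```
~/Dropbox/logger_data/2018/utq_A/utq_A_CR1000_met-2018_10_17.dat
```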
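
Copy sketch: one way to move the raw files from Dropbox into the site's raw directory, reusing the hypothetical year/area/site from the filename example; adjust all three to match the actual visit:

```bash
# Copy the new raw .dat files from Dropbox into the raw directory
# (2018, utq, and utq_A are placeholders for illustration).
cp ~/Dropbox/logger_data/2018/utq_A/*.dat /var/site/utq/utq_A/raw/
```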
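
Script sketch: a rough picture of how the two bash functions and their commented-out calls might sit inside process_$SITE.sh. The function names, paths, and the placement of the datapro lines are assumptions, so mirror whatever the existing script actually does:

```bash
#!/bin/bash
# process_utq_A.sh -- hypothetical skeleton, for illustration only

# 1) delete all previously processed data in the outputs directory
clear_outputs () {
    rm -rf /var/site/utq/utq_A/outputs/*
}

# 2) apply the manual QA recorded in the qc/ spreadsheets
apply_manual_qa () {
    : # placeholder; the real script applies the $SENSORNAME_fixes.xlsx edits here
}

# Comment these two calls out on the first, quicker run; uncomment them
# for the final from-scratch run once the manual fixes are in place.
# clear_outputs
# apply_manual_qa

# ... datapro lines for each data table go in their sections below ...
```

Run it as, e.g., bash /var/site/bin/process_utq_A.sh, first with the two calls commented out and then again with everything uncommented.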
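
Upload sketch: one possibility, using rsync over ssh; the host and destination path are placeholders, and the real target is whichever of ngeedata / ocotal / the project data portal applies:

```bash
# Push the finished outputs to the archive host (host and path are placeholders).
rsync -av /var/site/utq/utq_A/outputs/ user@archive-host:/path/to/archive/utq_A/
```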