In this post I will go over installing Apache Spark and initial interactions from within R. I am currently using Linux/Ubuntu 20.04, so the instructions are tailored to my environment. The process should be similar on other Linux distributions as well as Mac and Microsoft environments. Getting Apache Spark There are a couple of routes to…
Category: R
PostgreSQL Table Creation and Bulk Insertion
As part of converting my Criminal Analysis Data Project code from R to Julia, I thought I would create a series of small posts, each detailing one component of translating the data operations. This particular post will show a solution for how to take tabular data from a CSV and load it…
Julia’s Gadfly for R ggplot2 Users
Over the past week I have been reading the documentation and playing with Julia’s Gadfly package. I thought it would be helpful to fellow R users coming from the world of ggplot2 to put together a quick reference guide to show the translation from one to the other. The coding and style for creating data…
Criminal Analysis: Data Storage (Part 3)
In this post, I will demonstrate loading my criminal activity data into Elasticsearch so it can be explored, analyzed and visualized in Kibana. For instructions on installing and configuring the Elastic (formerly ELK) Stack, see my previous post. Although this post will specifically reference the crime data from my PostgreSQL database, I will include additional…
Converting R scripts to Julia (Part 2)
As part of my Getting COVID-19 Data posts in R, Python and Julia, I will now advance to part two of the conversion process. As we saw in Part 1 of this post series, we duplicated the R scripts into the language-specific script folder and changed the file extensions to match the target language. In…
Converting R scripts to Python (Part 2)
As part of my Getting COVID-19 Data posts in R, Python and Julia, I will now advance to part two of the conversion process. As we saw in Part 1 of this post series, we duplicated the R scripts into the language-specific script folder and changed the file extensions to match the target language. In…
Getting COVID-19 Data (Julia)
In this post, I will cover getting open source COVID-19 data for the United States using Julia. The data pipeline demonstrated here is a very simple example and could easily be adapted into a Prefect, Apache NiFi or Apache Airflow ETL process. Data Search Performing a quick search on DuckDuckGo I got The COVID Tracking Project,…
Getting COVID-19 Data (Python)
In this post, I will cover getting open source COVID-19 data for the United States using Python. The data pipeline demonstrated here is a very simple example and could easily be adapted into a Prefect, Apache NiFi or Apache Airflow ETL process. Data Search Performing a quick search on DuckDuckGo I got The COVID Tracking Project,…
Converting R scripts to Julia (Part 1)
UPDATE (8-JAN-2021): I have decided to demonstrate the conversion process using my Getting COVID-19 Data post and script instead of my Criminal Analysis project for right now. I will still be working through the conversion process for those scripts as well, but for now, I will demonstrate the conversion and translation process on a shorter…
Getting COVID-19 Data (R)
In this post, I will cover getting open source COVID-19 data for the United States using R. The data pipeline demonstrated here is a very simple example and could easily be adapted into a Prefect, Apache NiFi or Apache Airflow ETL process. Data Search Performing a quick search on DuckDuckGo I got The COVID Tracking Project,…