This is a quick update on a small project I am working on. I hope to release the first version within the next few days.
I am currently working on setting up sports data sources that can be loaded in Tableau software. I had the opportunity to gather data about NBA games and players, and I wish to make this dataset available to those who would like to do some visualisation with Tableau.
Tableau can load data through a Web Data Connector. I am building such a connector, whose purpose is to query the data that I have already extracted and structured.
This tool will be online shortly, and 100% free. If you have a Tableau licence, I strongly encourage you to give it a try and build dashboards around this data.
How it works
The project is split into four steps:
- Develop an extractor (with Scala) that connects to the sports data API and fetches the raw data
- Process the raw data (with Spark) and structure it as JSON for many possible visualisations: season games, players, action games, etc.
- Load the structured data into a MongoDB cluster
- Develop the Tableau connector (with Node.js) that queries MongoDB and exposes the data to Tableau
The project is designed around an ETL (extract, transform, load) pattern, and it will be deployed in the cloud. CI/CD will be handled manually, with Docker, Rundeck and AWS EC2 instances.
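To make the pattern more concrete, here is a minimal sketch of the extract, transform and load stages, followed by the kind of query the connector would run. It is written in Python only for brevity; the real project uses Scala, Spark and Node.js. Everything in it is an assumption made for illustration: the sample records, the field names, the `season_games` collection name and the in-memory dictionary that stands in for MongoDB.

```python
from typing import Dict, List

# Made-up sample records standing in for the raw API response (hypothetical fields).
RAW_GAMES = [
    {"id": "g1", "season": "S1", "home": "TEAM_A", "away": "TEAM_B",
     "home_pts": 102, "away_pts": 99},
    {"id": "g2", "season": "S1", "home": "TEAM_C", "away": "TEAM_D",
     "home_pts": 121, "away_pts": 122},
]


def extract() -> List[dict]:
    """Extract: a real extractor would call the sports data API here."""
    return list(RAW_GAMES)


def transform(raw_games: List[dict]) -> List[dict]:
    """Transform: reshape each raw record into a flat, JSON-friendly document."""
    docs = []
    for game in raw_games:
        home_win = game["home_pts"] > game["away_pts"]
        docs.append({
            "_id": game["id"],
            "season": game["season"],
            "home_team": game["home"],
            "away_team": game["away"],
            "home_points": game["home_pts"],
            "away_points": game["away_pts"],
            "winner": game["home"] if home_win else game["away"],
        })
    return docs


def load(docs: List[dict], store: Dict[str, List[dict]]) -> None:
    """Load: append documents to a collection (a dict stands in for MongoDB)."""
    store.setdefault("season_games", []).extend(docs)


def query(store: Dict[str, List[dict]], season: str) -> List[dict]:
    """Connector side: read back the documents for one season."""
    return [d for d in store.get("season_games", []) if d["season"] == season]


def run_pipeline() -> Dict[str, List[dict]]:
    """Run the three stages in order and return the populated store."""
    store: Dict[str, List[dict]] = {}
    load(transform(extract()), store)
    return store
```

Calling `run_pipeline()` and then `query(store, "S1")` returns the two structured game documents. Keeping each stage as a separate function mirrors the split between extractor, processing job and connector, so each part can be replaced or redeployed on its own.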
As of now, every part of the project is in place. I just need to update the connector’s data model and rebuild it in production.
In a future blog post, I will go deeper into the architecture of this project, with architecture diagrams and more explanation of the technical choices that were made.
For the first release of the project, I will give access to one NBA data source so that Tableau users can build their own dashboards. After that, I will quickly release a new version that gives access to other NBA data sources, in order to cover other facets of the main dataset I have on my MongoDB cluster.
Feel free to give me feedback!