Define the Scope of the Project
The first step in any project is planning. Without some sort of plan, your project may lack the focus and direction it needs, and you will likely create more work for yourself by being disorganized.
Defining the scope of the project allows you to home in on what you will have accomplished when the project is complete. Begin with the end in mind. Having realistic expectations is important. Some projects can sound grand and ambitious, but we are governed by resources and time. What you want to do may rely heavily on what data is available. With that said, a portion of the planning process involves determining whether data exists and whether it contains the information you need to accomplish your goals.
Resources and time are real constraints that will temper your expectations more often than not. Constraints are important to identify in the planning stage because they determine the scope of the project and what it is not going to cover. What skills do you need to execute the project from start to end? This can be fleshed out during this stage by mapping out the entire project.
Ask yourself: Do you have the resources and time to process and analyze 10 data sources, or just enough to get through 1? The answer can help you map out the entire project.
Mapping your project
At a high level, most projects follow similar processes when you distill everything down. A flow diagram may help to identify what you will need to go from data search to analysis, and then on to a decision. Mapping out your workflow and project scope will help you focus your efforts and give you concrete tasks to complete so that your follow-on stages can be accomplished.
Example of mapping a project
In the following example, I used XMind to scope out ideas for a project. I decided to conduct an analysis of criminal activity data. I was curious as to when various crimes occur, where they might occur more often, and what might influence criminal activity. The results of this analysis could help identify staffing levels, locations to monitor, and locations that might need more resources, services, or community support. It is applicable to local law enforcement, the general population (residents and visitors), and local elected officials or volunteer community services.
During the planning, I identified multiple options when it came to cities to analyze. Not every city will have available data, so it’s a good idea to write down some alternatives as back-ups. Similarly, data may not be available for the date ranges you’re interested in. Stay flexible, but also consider redefining your parameters if you don’t have enough resources or data initially. You might be able to find more data or come back later with more time and resources to do a larger project. The map below does not cover everything, but it at least gives some idea of how to drill down into each of the parameters or ask questions that you might want to answer.
As I researched what data was available, I settled on Washington D.C. The data seemed relatively straightforward, with lots of fields that could be analyzed.
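The back-up idea above can be sketched in a few lines of Python. This is only an illustration: the city names and date ranges are placeholders (not real data availability), and the helper names `covers` and `usable_candidates` are made up for this example.

```python
from datetime import date


def covers(available, wanted):
    """True if the available (start, end) range fully contains the wanted range."""
    return available[0] <= wanted[0] and available[1] >= wanted[1]


def usable_candidates(candidates, wanted):
    """Keep candidates that have data covering the wanted range, in listed order."""
    return [
        name
        for name, available in candidates.items()
        if available is not None and covers(available, wanted)
    ]


# Placeholder values for illustration only; these are NOT real availability dates.
candidates = {
    "City A": None,                                    # no data found
    "City B": (date(2015, 1, 1), date(2018, 12, 31)),  # ends too early
    "City C": (date(2010, 1, 1), date(2020, 12, 31)),  # covers the full range
}
wanted = (date(2016, 1, 1), date(2019, 12, 31))

print(usable_candidates(candidates, wanted))
```

If the list comes back empty, that is a signal to either add more alternatives or narrow the date range you want to cover.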
The key is to define reasonable questions and formulate the problem statement, or intended purpose of the project.
Key considerations to map out:
- What resources will be needed to process the data, store the data, analyze and visualize the data?
- What date ranges, if applicable, will the project cover?
- Where are you going to get the data?
- Are there key questions you want to answer?
- Is this a descriptive or predictive project?
- Do you have any key performance indicators?
- What defines success?
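If you prefer to keep these considerations next to your code rather than in a diagram, one option is a small record like the sketch below. The `ProjectScope` class and its field names are hypothetical, chosen only to mirror the questions in the list above; adapt them to whatever your project needs.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ProjectScope:
    """Hypothetical container mirroring the planning questions above."""
    problem_statement: str
    project_type: str  # "descriptive" or "predictive"
    data_sources: List[str] = field(default_factory=list)
    key_questions: List[str] = field(default_factory=list)
    resources: List[str] = field(default_factory=list)
    kpis: List[str] = field(default_factory=list)
    success_criteria: str = ""
    start_date: Optional[date] = None
    end_date: Optional[date] = None

    def open_items(self) -> List[str]:
        """Return the considerations that have not been answered yet."""
        checked = ("data_sources", "key_questions", "resources",
                   "kpis", "success_criteria")
        missing = [name for name in checked if not getattr(self, name)]
        if self.start_date is None or self.end_date is None:
            missing.append("date_range")
        return missing


scope = ProjectScope(
    problem_statement="When and where do various crimes occur?",
    project_type="descriptive",
    data_sources=["Washington D.C. crime data"],
    key_questions=["When do various crimes occur?",
                   "Where do they occur more often?"],
)
print(scope.open_items())
```

Anything returned by `open_items()` is a planning question that still needs an answer, or a deliberate decision that it does not apply to this project.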
High-Level Workflow
Creating a high-level workflow helps you understand what needs to happen to get data from the search stage to the analysis stage. A general high-level workflow will likely be similar across different projects. The specifics of what happens within each stage will likely be project-dependent.
Within the same high-level workflow, you can create branches and separate designs to capture the workflow within each stage. As your project develops, you may need to adjust or modify the workflows to match what is actually needed, which may not have been known at the beginning. There are lots of tools out there to help diagram this, and some can even be used to control your workflow by triggering scripts and macros.
Software to help with workflow:
- Pentaho
- XMind
- Joget
- FOXopen
- ProcessMaker
- MS PowerPoint
- MS Visio
This list is not exhaustive; it is merely what I found with a simple internet query or already knew about. You should not get bogged down in choosing specific software for this step. Instead, use what you are comfortable with and what lets you design what you need for your project. If you have time, explore some tools and learn how to use them when you are not under a time crunch. Each tool has its pros and cons, so at the end of the day, it has to work for you!
Example Workflows
Over the years, I have developed some simple workflows that seem to capture what is happening. They are clean and simple and don’t require much time to digest. Your workflow could be as basic as this, which was done in MS PowerPoint:
This captures the key stages in the overall process. The next graphic depicts a variation of my basic workflow that captures a little more detail within each stage. In reality, your workflow will have more back-and-forth between processes. Don’t be surprised if you even have to go back a few stages as you work on a project. As a project matures, this should stabilize.
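The back-and-forth between stages can also be sketched in code. The stage names below are assumptions loosely based on the "data search to analysis to decision" flow described earlier, not the exact stages in my diagrams, and `next_stage` is a hypothetical helper.

```python
# Hypothetical stage names; replace them with the stages in your own workflow.
STAGES = ["search", "process", "analyze", "decide"]


def next_stage(current, needs_rework=False, steps_back=1):
    """Move forward one stage, or go back `steps_back` stages when rework is needed."""
    i = STAGES.index(current)
    if needs_rework:
        # Never go back past the first stage.
        return STAGES[max(0, i - steps_back)]
    # Stay on the last stage once it is reached.
    return STAGES[min(len(STAGES) - 1, i + 1)]


print(next_stage("search"))                                      # forward one stage
print(next_stage("analyze", needs_rework=True, steps_back=2))    # back two stages
```

Early in a project you will probably call the rework path often; as the project matures, most moves should be forward.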