Makhlouf Aroua
5 min read · Nov 25, 2020


The Battle of Neighborhoods: Coursera Capstone Project:

Opening a new Coffee Shop in Dubai City, UAE

1. Discussion and Background of the Business Problem:

Coffee Shop

Introduction Section

This final project explores the best locations for a coffee shop in the city of Dubai. Located in the eastern part of the Arabian Peninsula on the coast of the Persian Gulf, Dubai aims to be the business hub of Western Asia. It is also a major global transport hub for passengers and cargo. Oil revenue helped accelerate the development of the city, which was already a major mercantile hub. Dubai’s oil output made up 2.1 percent of the emirate’s economy in 2008. A center for regional and international trade since the early 20th century, Dubai relies on revenues from trade, tourism, aviation, real estate, and financial services. According to government data, the population of Dubai was estimated at 3,400,800 as of 8 September 2020.
Its diverse culture brings a diverse range of food. Dubai has many restaurants across categories such as Chinese, Indian, and French.

Target Audience

  • Business people who want to invest in or open a coffee shop.
  • Anyone trying to find the best location for opening a coffee shop.
  • Budding data scientists who want to apply some of the most widely used exploratory data analysis techniques to obtain the necessary data, analyze it, and finally tell a story with it.

Data Section

  1. For this objective, we will use open data from the Dubai Statistics Center. The data is available as an Excel sheet and requires a considerable amount of cleaning. The data source is accessible at the location below:

Data Source: https://www.dsc.gov.ae/Report/DSC_SYB_2019_01%20_%2002.xlsx

  • Description: This data set contains the required information, and we will use it to explore the neighborhoods of Dubai.
  • https://en.wikipedia.org/wiki/Dubai

  2. Coffee shops in the neighborhoods of Dubai.

  • Data Source: Foursquare API
  • Description: Using this API, we will get all the venues in each neighborhood and then filter them to keep only coffee shops.

Approach

  • Collect the Dubai city data from https://www.dsc.gov.ae/Report/DSC_SYB_2019_01%20_%2002.xlsx.
  • Use the Foursquare API to get all venues for each neighborhood.
  • Keep only the venues that are coffee shops.
  • Analyze the data using clustering (specifically K-Means):
    1. Find the best value of K.
    2. Visualize the neighborhoods along with their number of coffee shops.
  • Compare the neighborhoods to find the best place to start a coffee shop.
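The "find the best value of K" step is not spelled out in this write-up. One common option is to score several candidate values with the silhouette coefficient and keep the highest. Below is a minimal sketch of that idea, assuming scikit-learn is available; the helper name `best_k` and the synthetic feature matrix are illustrative stand-ins, not the project's actual code or data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def best_k(features, k_values, random_state=0):
    """Return the k with the highest silhouette score, plus all scores."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return max(scores, key=scores.get), scores


# Synthetic stand-in for the real data: three well-separated groups of points.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
features = np.vstack([c + rng.normal(scale=0.3, size=(20, 2)) for c in centers])

k, scores = best_k(features, range(2, 7))
print(k, scores)
```

On real neighborhood data the scores are usually much closer together than in this toy case, so it is worth looking at the whole `scores` dictionary rather than only the winner.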

Problem Statement

  1. What is the best location for a coffee shop in Dubai?
  2. In which neighborhood should I open a coffee shop?

2. Data Preparation:

I will use Dubai city data for this project.

Stage 2.1: Cleaning Data

Stage 2.2: Coordinates:

In this stage, we will extract each community’s coordinates and append them to our data frame.

To reduce the time this takes, we will obtain coordinates only for the 100 communities with the highest population.
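As a rough sketch of this stage, the snippet below keeps the most populous communities and attaches coordinates to them. It assumes pandas and a data frame with `Community` and `Population` columns; those column names, the helper `top_communities_with_coords`, and the sample rows are all made up for illustration. The `geocode` argument stands in for whichever geocoding service is used, which this article does not name.

```python
import pandas as pd


def top_communities_with_coords(df, geocode, n=100):
    """Keep the n most populous communities and add Latitude/Longitude columns.

    `geocode` is any callable mapping a community name to (lat, lng) or None.
    """
    top = df.nlargest(n, "Population").reset_index(drop=True)
    coords = [geocode(name) or (None, None) for name in top["Community"]]
    top["Latitude"] = [lat for lat, _ in coords]
    top["Longitude"] = [lng for _, lng in coords]
    return top


# Made-up rows and a stub lookup table, for illustration only.
sample = pd.DataFrame({
    "Community": ["Community A", "Community B", "Community C"],
    "Population": [1000, 5000, 3000],
})
lookup = {"Community B": (25.10, 55.20), "Community C": (25.20, 55.30)}

result = top_communities_with_coords(sample, lookup.get, n=2)
print(result)
```

Passing the geocoder in as a callable keeps the slow network lookups separate from the data-frame logic, so the selection step can be checked without calling any external service.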

As you can see from the above, it takes a lot of effort to make the data usable for your requirements.

I will save this dataset and publish it on Kaggle for anyone who later needs the top 100 communities in Dubai along with their populations.

Stage 3: Mapping:

Let’s take a look at Dubai and, based on our dataset, see where all these communities are.

For mapping, we will be using Folium.

Stage 4: Foursquare

Now that we have everything we need, let’s proceed to the next step: Foursquare.
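The exact Foursquare request is not shown in this article, so the sketch below is an assumption: it targets the legacy v2 `venues/explore` endpoint and its `response → groups → items → venue` JSON layout. The helper names, the default radius and limit, the version date, and the sample payload are illustrative, and no network call is made here.

```python
from urllib.parse import urlencode

EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=500, limit=100):
    """Build the request URL for venues around one community centre."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


def parse_venues(community, payload):
    """Flatten an explore-style payload into (community, venue, category) rows."""
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = venue.get("categories") or [{}]
            rows.append((community, venue["name"], categories[0].get("name")))
    return rows


def coffee_shops_only(rows):
    """Keep only the rows whose category is exactly 'Coffee Shop'."""
    return [row for row in rows if row[2] == "Coffee Shop"]


# Hand-written payload in the assumed shape, not real API output.
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Sample Cafe", "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "Sample Hotel", "categories": [{"name": "Hotel"}]}},
]}]}}

url = build_explore_url(25.2, 55.27, "YOUR_ID", "YOUR_SECRET")
rows = parse_venues("Community A", sample_payload)
print(coffee_shops_only(rows))
```

In a real run, the URL would be fetched once per community (for example with `requests.get(url).json()`) and the resulting rows collected into a single data frame.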

Stage 5: Prepare & Analyze

Let’s start analyzing our data for each community and transform it so we can use it efficiently during the ML process.

Step 5.1: Prepare

Let’s prepare our data so it can conform to ML standards.

The categories provided by Foursquare are labels, and most machine learning algorithms cannot operate on label data directly. They require all input and output variables to be numeric.

To transform our data into numerical form, we will perform one-hot encoding. This turns each category label into a feature column in our data frame, with a value of 0 or 1.
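Here is a small sketch of that encoding using `pandas.get_dummies`. The column names and the three sample venues are made up for illustration. The final averaging step is an assumption about how the encoded rows might be summarized per community; it is not described in this article.

```python
import pandas as pd

# Made-up venues; real rows would come from the Foursquare results.
venues = pd.DataFrame({
    "Community": ["Community A", "Community A", "Community B"],
    "Category": ["Coffee Shop", "Hotel", "Coffee Shop"],
})

# One column per category, holding 1 where the venue has that category.
onehot = pd.get_dummies(venues["Category"], dtype=int)
onehot.insert(0, "Community", venues["Community"])

# Average per community: the share of its venues that fall in each category.
grouped = onehot.groupby("Community").mean().reset_index()
print(grouped)
```

Each venue row now has exactly one 1 across the category columns, and each row of `grouped` describes a community by the mix of venue types found in it.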

Stage 6: Clustering & Analysis

Now let’s create clusters of communities based on where the coffee shops are situated. Once we have a visual of the clusters, we can start breaking them down and see how many are in each.

Step 6.1: Clustering

Let’s cluster our communities into 5 groups.
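A minimal sketch of this step with scikit-learn's `KMeans` is shown below. The single feature (each community's share of venues that are coffee shops) and the ten synthetic values are assumptions for illustration; the real input would be the prepared per-community data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in: 10 communities, one feature each
# (the share of a community's venues that are coffee shops).
coffee_share = np.array([0.0, 0.0, 0.01, 0.10, 0.12, 0.25, 0.27, 0.40, 0.42, 0.60])
features = coffee_share.reshape(-1, 1)

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(features)

# Size and mean coffee-shop share of each cluster.
for cluster in range(5):
    members = coffee_share[labels == cluster]
    print(cluster, len(members), round(float(members.mean()), 3))
```

The cluster numbers themselves are arbitrary, so it is the per-cluster summary, not the label, that tells us which groups have few or no coffee shops.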

Conclusion

As you can see from the above Folium map, the communities in cluster 3, marked in green, do not have any coffee shops, which gives us plenty of choice in where to start our business.

Communities in clusters 2 and 4 certainly have fewer coffee shops, so there is potential for a successful business there too.
