Comprehensive SQL-based analysis of Sydney's Airbnb market demonstrating advanced database querying, data exploration, and business insight generation. Uncovered pricing patterns, neighborhood insights, and actionable improvement opportunities for hosts.
Active Airbnb properties analyzed
Individual hosts managing properties
$958/night average price
11,287 of total listings
Shared rooms to luxury homes
Hosts with 'dirty' complaints identified
Setting up SQLite database, loading CSV data into tables, and establishing the foundation for SQL analysis.
import sqlite3
import pandas as pd
# Load the dataset
df = pd.read_csv('listings.csv')
# Create SQLite database connection
conn = sqlite3.connect('mytable.db')
# Save dataframe as SQL table
df.to_sql(name="listings", con=conn, if_exists='replace', index=False)
# Commit changes
conn.commit()
# Connect to database for SQL queries
%load_ext sql
%sql sqlite:///mytable.db
Database Connection Result:
Exploring the basic structure of Sydney's Airbnb market, including room types, total listings, and host distribution.
-- Explore different room types available
SELECT DISTINCT room_type
FROM listings;
-- Count total listings
SELECT COUNT(DISTINCT id) as total_listings
FROM listings;
-- Room type distribution
SELECT DISTINCT room_type, count(room_type) as listing_count
FROM listings
GROUP BY room_type;
Room Types Available:
Room Type Distribution:
Room Type | Count |
---|---|
Entire home/apt | 11,287 |
Private room | 2,823 |
Shared room | 91 |
Hotel room | 73 |
Analyzing average prices and review patterns across different room types to understand market positioning and guest preferences.
-- Average price and reviews by room type
SELECT DISTINCT room_type,
count(room_type) as listing_count,
ROUND(avg(price), 2) as avg_price,
ROUND(avg(number_of_reviews), 1) as avg_reviews
FROM listings
GROUP BY room_type
ORDER BY avg_price DESC;
Pricing & Review Analysis:
Room Type | Count | Avg Price | Avg Reviews |
---|---|---|---|
Entire home/apt | 11,287 | $464.29 | 36.8 |
Hotel room | 73 | $299.90 | 101.6 |
Private room | 2,823 | $134.31 | 27.1 |
Shared room | 91 | $115.99 | 7.1 |
Deep dive into neighborhood pricing patterns, host concentration, and market dynamics across Sydney's diverse areas.
-- Top 10 most expensive neighborhoods
SELECT DISTINCT neighbourhood,
count(id) as listing_count,
ROUND(avg(price), 2) as avg_price,
count(distinct host_id) as unique_hosts
FROM listings
GROUP BY neighbourhood
ORDER BY avg_price DESC
LIMIT 10;
Top 10 Most Expensive Neighborhoods:
Neighborhood | Listings | Avg Price | Hosts |
---|---|---|---|
Pittwater | 901 | $958.36 | 592 |
Woollahra | 449 | $688.73 | 311 |
Mosman | 191 | $539.48 | 135 |
Manly | 651 | $491.71 | 483 |
Waverley | 1,319 | $480.03 | 815 |
... 33 more neighborhoods |
Examining the composition of property types across neighborhoods to identify market positioning and opportunities.
-- Property type percentages by neighborhood (top 10)
SELECT neighbourhood,
ROUND((COUNT(CASE WHEN room_type = 'Entire home/apt' THEN 1 END) * 100.0 / COUNT(*)),1) AS pct_entire_home,
ROUND((COUNT(CASE WHEN room_type = 'Private room' THEN 1 END) * 100.0 / COUNT(*)),1) AS pct_private_room
FROM listings
GROUP BY neighbourhood
ORDER BY pct_entire_home DESC
LIMIT 10;
Property Mix by Neighborhood (Top 10):
Neighborhood | % Entire Home | % Private Room |
---|---|---|
Pittwater | 95.1% | 4.7% |
Manly | 92.9% | 7.1% |
Mosman | 91.1% | 8.9% |
Sutherland Shire | 89.6% | 9.9% |
Warringah | 89.1% | 10.8% |
... vs Bankstown: 43.7% entire homes, 55.7% private rooms |
Analyzing guest reviews to identify specific improvement opportunities for hosts, focusing on cleanliness issues as a concrete business case.
-- Load reviews data and join with listings
-- Find hosts with cleanliness complaints
SELECT host_id, host_name, name,
SUBSTR(comments, 1, 100) || '...' as comment_preview
FROM reviews
INNER JOIN listings ON listings.id = reviews.listing_id
WHERE comments LIKE '%dirty%'
LIMIT 5;
Hosts with Cleanliness Issues (Sample):
Host | Property | Issue Preview |
---|---|---|
Mirko | Sydney CBD Apartment | Place was dirty upon arrival, floors dirty... |
Tony | Pittwater Boat House | Property not cleaned, kitchen/bathroom dirty... |
John | Mosman Studio | Found it a bit too dirty, not cleaned well... |
+ Multiple other hosts identified |
Pittwater emerges as the premium market with $958/night average, while areas like Bankstown offer opportunities in the private room segment with 55.7% market share.
Entire homes command 4x premium over shared rooms ($464 vs $115/night). Hotel rooms generate highest review frequency (101 avg), indicating strong guest engagement.
Analysis identified specific hosts with cleanliness complaints, presenting concrete business opportunities for cleaning services and property management companies.
This project showcases SQL analysis skills and business intelligence capabilities.