More volatile, less equal. Housing and welfare in 21st century France

Supplemental material: Data, scripts, figures and additional analysis

Authors

Renaud Le Goix⁷

William Kutz⁸

Ronan Ysebaert⁹

⁷ Université Paris Cité, UMR Géographie Cités 8504 CNRS - F75013, renaud.legoix@u-paris.fr

⁸ Lund University, Lund, Sweden, william.kutz@cors.lu.se

⁹ Université Paris Cité, UAR RIATE CNRS F-75013, ronan.ysebaert@cnrs.fr

Published

November 12, 2025

About this notebook

This notebook serves as supplemental material for a paper published in Housing Studies. Please cite as: Le Goix R., Kutz W., Ysebaert R., 2025, More volatile, less equal. Housing and welfare in 21st century France, Housing Studies, DOI 10.1080/02673037.2025.2565248.

It details the sources and methodological framework, following reproducibility guidelines (open data and open science). The notebook covers all the R code used to prepare the analysis, maps and figures, and also includes some supplemental material that comments further on the datasets used and on the detailed methodological choices. It allows full replicability of the study when used with the original aggregated dataset. Where relevant, the structure of the code follows the layout of the paper, and refers to its sections and figure captions.

Online version: https://rlegoix.gitpages.huma-num.fr/housing_methodo_paper
GitLab repository: https://gitlab.huma-num.fr/rlegoix/housing_methodo_paper

1 Context and data

Our comparative analysis focuses on the city-regions of Paris, Lyon and Avignon for the years 2002-2018. Comprising 708 municipalities, each case is intended to reflect different market characteristics and positions in the French urban system. Our analysis draws on records of individual transactions made by households, collected by the French Chamber of Notaries and stored in the BIEN and PERVAL databases. These are proprietary datasets that can be accessed at cost. While the two databases differ in the structure of their data, their limitations are well documented and can be easily resolved (Casanova et al. 2017; Le Goix et al. 2021). The datasets contain information on the financial, demographic and material structure of each acquisition, which we aggregate at the municipal level. Beyond what one would expect to find in these records, such as the age and location of the property, its price, and the type and duration of financing, the files also include information on the socio-professional categories of buyers and sellers.

To interpret diverging trajectories of housing wealth and vulnerability over the 2000s we developed a classification scheme using principal component analysis (PCA) and hierarchical cluster analysis (HCA) of the transaction data. Component and cluster analyses are used together to generate hierarchically structured summaries of data around a pivotal average profile. We use this approach to identify different categories of market regimes derived from the aforementioned financial, demographic and material information included in the property transaction datasets.

1.1 R libraries

In this section, we load the R libraries used in this notebook.

Code

# Data handling
library(sf) # Spatial data handling
library(dplyr) # Some aggregates functions
library(reshape2) # Reshape
library(tidyverse) # Data handling
library(readxl) # Read XLS

# Data analysis
library(FactoMineR) # Multivariate analysis
library(factoextra) # Plot multivariate analysis outputs
library(fastcluster) # Clustering
library(TraMineR) # Sequence analysis
library(cluster) # Clustering (agnes)

# Display tables
library(kableExtra) # Table formatting

# Display maps
library(mapsf) # Thematic maps
library(maplegend) # Maps legend customization
library(slickR) # carousel for map displaying
library(leaflet) # Interactive Map
library(leaflet.extras) # Interactive mapping, search menu

# Data visualization (plots)
library(ggplot2) # Plots with ggplot
library(ggalluvial) # trajectories over time
library(ggthemes) # ggplot theme
library(viridis) # Colors for some plots
library(ggExtra) # ggplot 2 enhancements
library(vcd) # Visualizing categorical data

1.2 Data presentation: 23 core indicators

This chunk imports the reference dataset. For any further use of the dataset, make sure to include a full citation of the published paper.

Code

# Data 
df <- read.csv("data/data_input.csv", 
               colClasses = c("character", rep("numeric", 25)))

This .csv file includes 23 core indicators gathered at 8 time stamps (2002, 2004, 2006, 2008, 2012, 2015, 2018) covering 708 municipalities in the Paris region (Île-de-France) and in the Lyon and Avignon Functional Areas.

The 23 core indicators include information on the financial, demographic and material structure of each acquisition, aggregated at the municipal level. Beyond the information one would expect to find in these records, such as the age and location of the property, its final sales price, and the type and amount of debt, the dataset also includes variables describing the socio-professional characteristics of buyers and sellers as well as the types of property being invested in:

Listing of consolidated indicators at municipal scale
id	Label	measure_unit	note
INSEE_COM	French municipality id	/	Reference year 2019
year	Reference year	/
nbapparts	Number of apartments	Real estate transactions
nbmaisons	Number of houses	Real estate transactions
pricem2apparts	Price per square meter, apartments	euros
pricem2maisons	Price per square meter, houses	euros
PIRapparts	Price to income ratio, apartments	months	Months of local median income required to buy 1 sq. meter
PIRmaisons	Price to income ratio, houses	months	Months of local median income required to buy 1 sq. meter
Debt2Pxapparts	Debt to price, apartments	%	Sum of debt contracted in the municipality / sum of price paid
Debt2Pxmaisons	Debt to price, houses	%	Sum of debt contracted in the municipality / sum of price paid
PctFamilyHomes	Share of family homes	%	6 rooms and more, among real estate transactions that concern houses
PctSmallHouses	Share of houses with less than 2 rooms in real estate transactions	%	Less than 2 rooms, among real estate transactions that concern houses
PctSmallApartments	Share of small apartments	%	Less than 2 rooms, among real estate transactions that concern apartments
PctNewHomes	Share of houses built less than 5 years before the real estate transaction	%
Pct1970Homes	Share of houses built in the 1970s in real estate transactions	%
Pct1980Homes	Share of houses built in the 1980s in real estate transactions	%
PctNewApartments	Share of apartments built less than 5 years before the real estate transaction	%
BuyersWorkersEmployees	Share of workers and employees in the stock of buyers	%
BuyersExecutives	Share of highly qualified employees (executive, managers) in the stock of buyers of real estate transactions	%
BuyersRetired	Share of retirees in the stock of buyers	%
SellersWorkersEmployees	Share of workers and employees in the stock of sellers	%
SellersExecutives	Share of highly qualified employees (executive, managers) in the stock of sellers	%
SellersRetired	Share of retirees in the stock of sellers	%
Price_m2	Price per square meter, all	euros
PriceIncomeRatio	Price to income ratio, all	months	Months of local median income required to buy 1 sq. meter
Debt2Price	Debt to price, all	%	Sum of debt contracted in the municipality / sum of price paid

Table 1: Listing of consolidated indicators at municipal scale (partially used in Table 1 of the published version)

1.3 Geometries and geographical coverage

We load the geometries (municipalities of Île-de-France and of the Lyon and Avignon Functional Areas) and the map template parameters used for spatial analysis and mapping purposes.

Code

# Geometries
source(file = "scripts/map_template.R", encoding = "UTF-8")

# Import combined layers
comb <- getLayers(x = "Combined")
com <- comb$com
dep <- comb$dep
metro <- comb$metro
study <- comb$study

# Define available years
years <- unique(df$year)

# Manage labels
labels <- data.frame(name = c("Paris (Ile-de-France)", "Avignon (FUA)", "Lyon (FUA)"),
                     x = c(674000, 655000, 720000),
                     y = c(6899000, 6810000, 6840000))
labels <- st_as_sf(labels, coords = c("x", "y"), crs = 2154, agr = "constant")

# Map theme
mf_theme(bg = NA, fg = NA, mar = c(0, 0, 2, 0),
         tab = FALSE, pos = "left", inner = FALSE, line = 1.8, cex = 1.3,
         font = 2)

708 municipalities are covered by this dataset. In order to avoid outliers, we include in the analysis only municipalities that are characterised by at least 10 real estate transactions over the entire time span of the study (2002-2018).

Code

df_f <- unique(df$INSEE_COM)

# Paris
mf_init(com[com[["study"]] == "Paris",])
mf_inset_on(fig = c(0, 0.89, 0.3, 1))  
mf_map(study[study$study == "Paris",], col = "white", border = NA)
mf_map(com[com$study == "Paris" & com$INSEE_COM %in% df_f,], 
       col = "darkblue", border = "white", add = TRUE)
mf_map(dep, col = NA, border = "black", add = TRUE)
mf_inset_off()

# Lyon
mf_inset_on(fig = c(0.6, 1, 0, 0.48))
mf_map(study[study$study == "Lyon",], col = "white", border = NA)
mf_map(com[com$study == "Lyon" & com$INSEE_COM %in% df_f,], 
       col = "darkblue", border = "white", add = TRUE)
mf_map(metro, col = NA, border = "black", add = TRUE)
mf_map(study[study$study == "Lyon",], col = NA, border = "black", add = TRUE)
mf_scale(size = 20, col = "black")
mf_inset_off()
  
# Avignon
mf_inset_on(fig = c(0.35, 0.55, 0.05, 0.25))
mf_map(study[study$study == "Avignon",], col = "white", border = NA)
mf_map(com[com$study == "Avignon" & com$INSEE_COM %in% df_f,], 
       col = "darkblue", border = "white", add = TRUE)
mf_map(study[study$study == "Avignon",], col = NA, border = "black", add = TRUE)
mf_inset_off()
  
# Template
mf_credits(txt = paste0("Source : Base PERVAL / BIEN – Chambre des Notaires, ",
                        "INSEE, IGN, 2021\n",  "Credits : Le Goix, Kutz, Ysebaert, 2025"), cex = .5)
mf_label(x = labels, var = "name", halo = FALSE, cex = .8, font = 4)

Figure 1: A sample of 708 municipalities

1.4 Data sources and preprocessing

The original data were acquired, for a fee, from two separate providers. These are commercial data recording each transaction, distributed through the databases of the Chamber of Notaries: BIEN for Paris (Île-de-France), and PERVAL for the rest of France. We consolidated both datasets during the ANR project Wisdhom (Wealth Inequalities and the Dynamics of Housing Market).

The original dataset is highly disaggregated. It covers individual real estate transactions and contains information on the property (location, number of rooms, surface, construction date, etc.) and on the characteristics of the transaction (price paid, contracted debt, social category and age of the buyer and the seller, etc.). The detailed description of all available variables provided by these databases can be found here. Municipal income (used for the price to income ratio) is provided by INSEE, and geometries by IGN.

The pre-processing steps applied to this highly detailed dataset were originally documented for the team members of the project, and are now released for research as open data. The entire documented procedure we followed to consolidate and aggregate the original data, with R scripts, can be found on this institutional repository. For the sake of readability, we nevertheless summarize this procedure in the following sub-section.

1.4.1 Sample data selection

During the database consolidation phase, we elected to exclude incoherent transactions and outliers. This data clean-up was a prerequisite of the analysis: because transactions are recorded manually by the clerks of each notarial office, they are known to include inconsistencies. On the usages and expectations regarding the sampling and representativeness of real-estate databases, see Casanova et al. (2017) and Le Goix et al. (2021).

For this project, we elected to exclude from the datasets the following records:

Transactions located outside the study areas (Paris, Lyon, Avignon), according to both X and Y coordinates and municipal codes.
Duplicated transactions (transaction id number)
Price outliers: transactions with NA nominal seller prices, or a price less than 1.
Transactions in the highest or lowest 1 % of values for price or surface.
Transactions with an estimated living area below 1 sq. meter.
Properties that are neither apartments nor houses (castles, service rooms, farms, etc.).
Transactions involving corporations and, generally speaking, juridical persons: we keep transactions between physical / natural persons only, as we focus on the socio-economic characterisation of sellers and buyers. As a caveat, corporate transactions are often poorly documented in these databases.

We also had to handle some sampling issues. Some of the data delivered by the providers (BIEN) originally consisted of a 50% sample of all transactions (in municipalities above 10 000 inhabitants in Île-de-France, in 2011), whereas other municipalities have 100% coverage. We therefore had to apply a weight to transactions in municipalities with 50% samples.
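The weighting step can be illustrated with a minimal sketch. It assumes an inverse-probability weight (a transaction drawn from a 50% sample counts twice), which is a common choice but is not confirmed by the text above; the table `tx`, its column names and its values are hypothetical.

```r
# Hypothetical transaction table: column names, values and the
# inverse-probability weighting are assumptions made for illustration.
tx <- data.frame(
  INSEE_COM     = c("75101", "75101", "77001", "77001", "77001"),
  price         = c(300000, 420000, 150000, 180000, 210000),
  sampling_rate = c(0.5, 0.5, 1, 1, 1)  # 50% sample vs. full coverage
)

# A transaction drawn from a 50% sample stands for two transactions,
# a transaction from a fully covered municipality stands for itself.
tx$w <- 1 / tx$sampling_rate

# Weighted number of transactions and weighted mean price per municipality
w_count <- tapply(tx$w, tx$INSEE_COM, sum)
w_price <- tapply(tx$price * tx$w, tx$INSEE_COM, sum) / w_count
```

With a constant weight inside each municipality, weighted means are unchanged; the weight matters for counts and for any total computed across municipalities with different coverage.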

Results of the data cleaning process can be found here.

1.4.2 Dataset aggregation at the municipal level

In the data_input.csv file, all required variables are aggregated at the municipal level (based on the geographic location of the transaction and/or its municipal id). We followed two guidelines to manage missing values and municipalities with few real estate transactions:

Consolidated indicators are computed, per municipality and per year, from the transactions with non-missing and non-outlier values.
To avoid municipalities with scarce transactions, which are subject to outliers, we only keep municipalities that record, over the entire period of analysis, an average of at least 10 transactions.
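The aggregation logic can be sketched in base R as follows. This is an illustration under stated assumptions, not the project's actual script: the tables `tx` and `income`, their column names, the use of a mean of per-transaction prices per square meter, and the lowered threshold `min_tx` are all hypothetical. Only the debt to price formula (sum of debt / sum of price) and the price to income definition (months of local income per square meter) come from the indicator table.

```r
# Hypothetical cleaned transactions and municipal incomes (toy values)
tx <- data.frame(
  INSEE_COM = rep(c("69001", "84007"), each = 4),
  year      = rep(c(2002, 2002, 2004, 2004), times = 2),
  price     = c(200000, 300000, 250000, 350000, 90000, 110000, 100000, 120000),
  debt      = c(150000, 150000, 200000, 100000, 45000, 55000, NA, 60000),
  surface   = c(80, 100, 90, 110, 60, 70, 65, 75)
)
income <- data.frame(INSEE_COM = c("69001", "84007"),
                     monthly_income = c(2000, 1500))

# Indicators for one municipality-year, ignoring missing debt values
agg_one <- function(d) {
  ok <- !is.na(d$debt) & !is.na(d$price)
  data.frame(
    INSEE_COM  = d$INSEE_COM[1],
    year       = d$year[1],
    n          = nrow(d),
    Price_m2   = mean(d$price / d$surface, na.rm = TRUE),
    Debt2Price = 100 * sum(d$debt[ok]) / sum(d$price[ok])
  )
}

groups <- split(tx, list(tx$INSEE_COM, tx$year), drop = TRUE)
agg <- do.call(rbind, lapply(groups, agg_one))

# Price to income ratio: months of local income needed to buy 1 sq. meter
agg <- merge(agg, income, by = "INSEE_COM")
agg$PriceIncomeRatio <- agg$Price_m2 / agg$monthly_income

# Keep municipalities whose average yearly count reaches the threshold
min_tx <- 2  # the notebook uses 10; lowered here to fit the toy data
mean_n <- tapply(agg$n, agg$INSEE_COM, mean)
agg <- agg[agg$INSEE_COM %in% names(mean_n)[mean_n >= min_tx], ]
```

Computing the debt ratio as a ratio of sums, rather than a mean of individual ratios, gives more weight to expensive transactions, which is consistent with the definition given in the indicator table.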

2 Methodology: a typology of local markets (Risk, Renew, Reproduce)

2.1 Preliminary analysis

First, with regard to the financial structure of the data, we characterize an “average market” as an intermediate bracket of (i) transaction prices, (ii) purchasing power, and (iii) indebtedness. Specifically, we consider the average transaction price of the three city-regions as the mean cost per square meter of the sampled properties; average purchasing power is measured as a price-to-income ratio, in this case the number of months of income an average buyer would need to purchase one square meter of real estate; finally, we measure average indebtedness as a debt-to-price ratio, expressed as a percentage of the overall transaction cost. In this context, the average price per square meter of real estate is 2,460 euros, which would take an average buyer 1.65 months of income to acquire, and which is leveraged at a 65% debt ratio.

2.2 Univariate analysis

Table 2 shows a statistical summary of each of the 24 indicators in the dataset, with the number of observations (municipality-year pairs over the whole time period, n), mean, standard deviation (sd), standard error (se), first quartile (Q0.25), median (Q0.5) and third quartile (Q0.75).

Code

univar <- psych::describe(df[,c(3:length(df))], range = FALSE, skew = FALSE, 
                          quant = c(.25,.5,.75))
univar$vars <- NULL
kable(univar, booktabs = T, digits=2)

	n	mean	sd	se	Q0.25	Q0.5	Q0.75
nbapparts	5664	79.86	169.45	2.25	6.00	23.00	77.00
nbmaisons	5664	17.49	16.45	0.22	7.00	13.00	22.00
pricem2apparts	5418	2926.39	1283.77	17.44	2141.20	2780.97	3338.30
pricem2maisons	5506	2840.14	1177.61	15.87	2138.77	2617.98	3139.32
PIRapparts	5418	1.77	0.75	0.01	1.31	1.68	2.00
PIRmaisons	5506	1.54	0.56	0.01	1.20	1.39	1.72
Debt2Pxapparts	5157	64.87	34.60	0.48	30.59	81.25	93.17
Debt2Pxmaisons	5189	58.85	33.58	0.47	26.32	68.60	88.92
PctFamilyHomes	5664	29.48	18.94	0.25	16.67	28.57	40.00
PctSmallHouses	5664	2.52	6.72	0.09	0.00	0.00	2.56
PctSmallApartments	5664	11.31	13.76	0.18	0.00	8.48	16.46
PctNewHomes	5664	4.34	11.30	0.15	0.00	0.00	0.00
Pct1970Homes	5664	21.05	20.00	0.27	2.52	16.67	32.20
Pct1980Homes	5664	16.25	17.62	0.23	0.00	12.50	25.00
PctNewApartments	5664	14.05	22.08	0.29	0.00	2.44	20.00
BuyersWorkersEmployees	5664	27.83	14.92	0.20	16.67	27.27	37.88
BuyersExecutives	5664	22.86	14.20	0.19	12.50	20.00	31.51
BuyersRetired	5664	7.00	6.15	0.08	2.60	6.12	10.00
SellersWorkersEmployees	5664	19.86	11.66	0.15	11.11	19.05	27.35
SellersExecutives	5664	17.84	11.55	0.15	9.52	16.67	25.00
SellersRetired	5664	23.63	11.16	0.15	16.67	23.53	30.04
Price_m2	5660	2860.01	1243.32	16.53	2115.95	2685.61	3172.04
PriceIncomeRatio	5660	1.65	0.72	0.01	1.24	1.50	1.81
Debt2Price	5575	62.31	31.14	0.42	29.76	76.34	89.33

Table 2: Univariate parameters of housing regime variables

2.3 Time series management and variable selection

Time-series analysis requires the creation of a new row id that combines the municipality id (INSEE_COM) and the year: each spatial entity is described by one row for each given year.

Code

df$idYear <- paste0(df$INSEE_COM, df$year)

In this chunk, we select the variables of interest.

Code

cols <- c("PriceIncomeRatio","Debt2Price", "PctFamilyHomes",
          "PctSmallHouses","PctSmallApartments","PctNewHomes", "Pct1970Homes",
          "Pct1980Homes","PctNewApartments","BuyersWorkersEmployees", 
          "BuyersExecutives","BuyersRetired", "SellersWorkersEmployees",
          "SellersExecutives", "SellersRetired")

row.names(df) <- df$idYear

To provide readers with a preliminary univariate and bivariate analysis, we include histograms of each variable and a correlation matrix.

Code

df[,cols] %>% 
  # Gather all selected variables (not only the first 13 columns)
  gather(Attributes, value, everything()) %>% 
  ggplot(aes(x=value)) +
  geom_histogram(fill = "lightblue2", color = "black") + 
  facet_wrap(~Attributes, scales = "free_x") +
  labs(x = "Value", y = "Frequency")

Code

# Select only complete rows (no missing values)
df <- df[complete.cases(df[,cols]),]

# Compute the correlation matrix
cormat <- round(cor(df[,cols]),2)

# Get lower triangle of the correlation matrix
get_lower_tri <- function(cormat){
  cormat[upper.tri(cormat)] <- NA
  return(cormat)
}

# Get upper triangle of the correlation matrix
get_upper_tri <- function(cormat){
  cormat[lower.tri(cormat)] <- NA
  return(cormat)
}

# Keep the upper triangle only, then melt the matrix to long format
upper_tri <- get_upper_tri(cormat)
melted_cormat <- melt(upper_tri, na.rm = TRUE)

# Heatmap
ggplot(data = melted_cormat, aes(Var2, Var1, fill = value))+
 geom_tile(color = "white")+
 scale_fill_gradient2(low = "blue", high = "red", mid = "white", 
   midpoint = 0, limit = c(-1,1), space = "Lab", 
   name="Pearson\nCorrelation") +
  theme_minimal()+ 
 theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))+
 coord_fixed()

Findings

We look to the demographic structure of the transaction data as a way to describe the socio-economic composition of an average housing sub-market. In this case, an average market is typified by a mixed population of buyers and sellers: blue-collar workers and employees represent an average of 28% of buyers and 20% of sellers; executives represent on average 23% of buyers and 18% of sellers. The two categories tend to be the most active in sorting out discriminating buyers and sellers on the market, and their visibility is often a good indicator for distinguishing between exclusive and gentrifying neighborhoods versus those with greater socio-economic diversity. Furthermore, in patterns of wealth accumulation, retirement and age are key indicators that characterize several important ownership dynamics, such as multi-ownership (e.g. buy-to-let for revenue in retirement), selling and refinancing strategies to improve household equity (e.g. reverse mortgages), or intergenerational transfers of ownership, among others. In our sample, retirees represent 24% of all sellers, but only 7% of buyers listed in the transactions. We also include material and environmental aspects of the residential structures included in the transaction data, such as the age and size of the property, their proximity to historical central neighborhoods and their connection to specific urbanization cycles. The component and cluster analyses allow us to distinguish areas with an over-representation of particular types of housing in a given municipality, and a few characteristics stand out: family homes (i.e. more than 6 rooms) represent on average 29.4% of transactions, while suburban homes built in the 1970s and 1980s during France’s initial transition towards a homeowner society account for 21% and 20% of recorded acquisitions, respectively.
Additionally, we identified a preponderance of small houses, typically townhouses of less than 3 rooms in older, formerly blue-collar inner suburbs, and of small apartments, 1 bedroom or less, located in very central areas. Nevertheless, small houses only account for 2.5% of the total sampled properties, while small apartments represent 11% of transactions. Finally, to characterize more recent patterns of urbanization and neighborhood renewal, we included data on newly constructed homes and apartments, which account for 4% and 14% of the recorded sales in the sample.

2.4 Clustering

To interpret diverging trajectories of housing wealth and vulnerability over the 2000s we developed a classification scheme using principal component analysis (PCA) and hierarchical cluster analysis (HCA) of transaction data. These are used together to generate hierarchically structured summaries of data around a pivotal average profile. We use this method to identify different categories of markets derived from the financial, demographic and material content included in the datasets.

The cluster analysis is used to infer the municipal contexts of annual transactions from the correlations between (i) the financial structure of transactions, (ii) the social structure of buyers and sellers, and (iii) the material structure of the built environment. It reveals distinctive local market regime profiles, which we summarize, following the structure of the cluster dendrogram, into 10 market “profiles” (Figure 7, Figure 8). Figure 13 summarizes the z values of each variable, by cluster. These clusters are mapped, for each available year, in Figure 14.

2.4.1 Normalizing data

In principal component analysis, variables measured on different scales, or on a common scale with widely differing ranges, are often standardized. This is done by centering (subtracting the variable mean from each raw value) and scaling (dividing the difference by the standard deviation of the indicator).
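As a quick check of what this standardization does, the following sketch compares a manual z-score with the output of the base R `scale()` function on a toy data frame. The values are invented for illustration; only the two column names are borrowed from the dataset.

```r
# Toy data: invented values, column names borrowed from the dataset
toy <- data.frame(PriceIncomeRatio = c(1.2, 1.5, 1.8, 2.4, 3.1),
                  Debt2Price       = c(30, 62, 76, 89, 55))

# Manual z-score: subtract the column mean, divide by the column
# standard deviation (sd() uses the n - 1 denominator, like scale())
z_manual <- sapply(toy, function(x) (x - mean(x)) / sd(x))

# Same operation with scale(), which centers and scales by default
z_scale <- scale(toy)

# Both approaches give the same standardized values
same <- isTRUE(all.equal(z_manual, z_scale, check.attributes = FALSE))
```

After standardization each column has a mean of 0 and a standard deviation of 1, so that indicators expressed in months and in percentages weigh equally in the analysis.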

Code

# Scale results
df.norm <- scale(df[,cols])
kable(head(df.norm[,c(1:5)]), booktabs = T, digits = 2)

	PriceIncomeRatio	Debt2Price	PctFamilyHomes	PctSmallHouses	PctSmallApartments
010052002	-1.04	-0.30	-0.39	-0.38	-0.83
010322002	-0.75	0.21	0.56	-0.38	-0.83
010432002	-0.78	-1.07	-0.56	0.33	-0.83
010492002	-0.97	-0.90	-0.24	-0.38	-0.83
010922002	-1.29	-1.64	0.20	1.28	-0.83
011422002	-1.13	-1.18	-0.68	0.86	-0.83

Table 3: Standardized outputs, first rows and columns

2.4.2 Principal Component Analysis (PCA)

We first run a PCA as a preliminary exploratory analysis, using the FactoMineR library; plots are built with the factoextra library. This biplot (Figure 4) shows the contribution of each variable to the first two components of the PCA, which summarize 31.3 % of the overall variance. It highlights the variables that are grouped together, which are positively correlated with each other. It also shows the variables that are negatively correlated, which are displayed on opposite sides of the biplot’s origin.

Code

# Compute PCA with ncp = 4
res.pca <- PCA(df.norm, ncp = 4, graph = FALSE)

# Plot indicator contribution of the 2 first factorial planes
fviz_pca_var(res.pca, col.var = "contrib", labelsize = 3) #Axes 1 and 2

Figure 4: Contribution on factorial planes 1-2

Same biplot with the third and the fourth components of the PCA (17.7 % of the overall variance).

Code

fviz_pca_var(res.pca, axes = c(3,4), col.var = "contrib", labelsize = 3) # Axes 3 and 4

Figure 5: Contribution on factorial planes 3-4
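The shares of variance quoted for each factorial plane are derived from the eigenvalues of the PCA. With FactoMineR these are stored in `res.pca$eig`; the sketch below reproduces the arithmetic with base R `prcomp()` on simulated data, so that it runs without additional packages. The toy variables `a` to `d` are hypothetical and unrelated to the housing dataset.

```r
# Simulated data: two correlated variables and two independent ones
set.seed(42)
n <- 200
f <- rnorm(n)
toy <- data.frame(a = f + rnorm(n, sd = 0.3),
                  b = f + rnorm(n, sd = 0.3),
                  c = rnorm(n),
                  d = rnorm(n))

# PCA on standardized variables
pca <- prcomp(toy, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the components
eig <- pca$sdev^2

# Share of variance per component, and cumulative share
pct     <- 100 * eig / sum(eig)
cum_pct <- cumsum(pct)

# Share summarized by the first factorial plane (components 1 and 2)
plane_1_2 <- sum(pct[1:2])
```

In a standardized PCA the eigenvalues sum to the number of variables, so each percentage can be read as the number of original variables' worth of information carried by a component.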

Findings

Of the three structures (financial, social, material) used in the cluster analysis we found that the financial structure of ownership (Pm2-PIR-DIR) explains more in terms of geographical splintering of clusters than the social structure of households. Social factors are certainly part of the first set of criteria, but they point to sub-structures within the main overarching financial framework.

2.4.3 A typology of market regimes

Using the fviz_nbclust function, we extract the optimal number of clusters (k=3). We also use the HCA as an exploratory method to examine the structure of the data, and to go deeper into the analysis of the hierarchical tree. We therefore also interpret a 10-cluster solution.

We use the fastcluster library, which provides faster computation times than the base hclust function, with comparable results (Müllner 2013). We use the Ward method and the Euclidean metric.

Code

# Initiate typology
pca_coords <- res.pca$ind$coord[, 1:4]  # Get factors 1 to 4

clust <- hclust.vector(pca_coords, method = "ward", metric = "euclidean", p = NULL)

# Analyse 
clust$height <- clust$height^2
sortedHeight <- sort(clust$height, decreasing = TRUE) 
relHeight <- sortedHeight / sum(sortedHeight) *100
cumHeight <- cumsum(relHeight)

# Some plots to visualize outputs
par(mfrow=c(2,3))

barplot(sortedHeight[1:30], xlab = "Node", ylab = "Aggregation level",
        col = "black", border = "white")

barplot(relHeight[1:100], names.arg = seq(1, 100, 1),
        col = "black", border = "white", xlab = "Node", ylab = "Share of total inertia (%)")

barplot(relHeight[1:15], names.arg = seq(1, 15, 1),
        col = "black", border = "white", xlab = "Node", ylab = "Share of total inertia (%)")

barplot(cumHeight[1:100], col = "black", names.arg = seq(1, 100, 1),
        xlab = "Nb of clusters",
        ylab = "Share of total inertia (%)")

barplot(cumHeight[1:20], names.arg = seq(1, 20, 1), col = "black", border = "white",
        xlab = "Nb of clusters",
        ylab = "Share of total inertia (%)")

plot(clust, main = "Clusters dendrogram", 
     xlab = paste0("n=", nrow(pca_coords)), sub = "", labels = FALSE)

Figure 6: Cluster analysis inertia statistics
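The inertia statistics plotted above rest on a property of Ward clustering: squared merge heights are proportional to the within-cluster inertia added at each merge. The sketch below illustrates this on simulated data, using base R `hclust()` with `method = "ward.D2"` as a stand-in for `fastcluster::hclust.vector()`; that substitution, and the toy data, are assumptions made so that the example runs without extra packages.

```r
# Simulated data: three well-separated groups of 20 points in 2 dimensions
set.seed(1)
toy <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
             matrix(rnorm(40, mean = 4), ncol = 2),
             matrix(rnorm(40, mean = 8), ncol = 2))

# Ward clustering on Euclidean distances
hc <- hclust(dist(toy), method = "ward.D2")

# Squared heights, sorted from the last merge (root) to the first
h2  <- sort(hc$height^2, decreasing = TRUE)
rel <- 100 * h2 / sum(h2)   # share of total inertia per node
cum <- cumsum(rel)          # cumulative share

# With ward.D2, the squared heights sum to twice the total sum of
# squared deviations from the overall centroid
total_ss <- sum(scale(toy, scale = FALSE)^2)
```

Here `cum[k - 1]` is the share of total inertia accounted for by a partition into `k` clusters, which is how the cumulative barplots can be used to compare candidate numbers of clusters.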

Code

pca_coords <- res.pca$ind$coord
fviz_nbclust(pca_coords, hcut, method = "gap_stat", k.max = 15)

Figure 7: Optimal Number of Clusters - Gap statistics

Code

# 10 classes
cahPal10 <- c("lightyellow","#c7e9c0","#005a32","#fcae91","#fb6a4a","#cb181d",
              "#33a02c","#1f78b4","#6a3d9a", "#084594")

# 3 classes
cahPal3 <- c("#e31a1c","#33a02c", "#1f78b4")

This section of code draws the dendrograms for both the k=10 and k=3 (risk, renew, reproduce) solutions. This supplemental material contains the original dendrogram outputs, which have been stylized in the published version (Figure 2) for better readability. The dendrogram provides a thematic overview of the three market regimes (risk, renew, reproduce) and the variations of their 10 sub-market trajectories.

Code

clus <- cutree(clust, k = c(3,10)) 
colnames(clus) <- c("clus3", "clus10")
dend <- as.dendrogram(clust)

# Dendrogram, following method by https://www.davidzeleny.net/anadat-r/doku.php/en:hier-agglom_examples#example_2ward_cluster_algorithm_applied_on_barro_colorado_island_data

#### Dendrogram for the 10 clusters solutions ####
par(mfrow = c(1,1))
groups <- cutree(clust, k = 10) 
#clus

# Ordering cluster IDs according to cluster order in the dendrogram
#clust$order
group.order <- groups[clust$order]
#group.order

group.in.cluster <- unique (group.order)

#Checking cluster order
#group.in.cluster
# [1] 10  9  5  1  7  6  3  8  2  4

#Manually assigning colors by cluster order
cahPal10_tree <- c("#084594","#6a3d9a","#1f78b4","#fcae91","#fb6a4a","#cb181d",
                   "lightyellow","#005a32","#33a02c","#c7e9c0")
plot(dend, 
     main="10 clusters solution - Dendrogram", leaflab = "none")
rect.hclust (clust, border = cahPal10_tree, k = 10) 
legend ('topright', legend = paste ('Cluster', group.in.cluster), pch = 15,
        col = cahPal10_tree, bty = 'y', cex = .5)

Figure 8: Classification tree (dendrogram), 10 clusters and solutions for an analytical framework: risk, renew, reproduce (Figure 2 in the published version)

Code

#### Dendrogram for the 3 clusters solutions ####
groups <- cutree(clust, k = 3) 
#clus

# Ordering cluster IDs according to cluster order in the dendrogram
#clust$order
group.order <- groups[clust$order]
#group.order

group.in.cluster <- unique (group.order)

#Checking cluster order
#group.in.cluster
# [1] 3 1 2

#Manually assigning colors by cluster order
cahPal3_tree <- c("#1f78b4","#e31a1c", "#33a02c")
plot(dend, 
     main="3 clusters solution - Dendrogram", leaflab = "none")
rect.hclust (clust, border = cahPal3_tree, k = 3) 
legend ('topright', legend = paste ('Cluster', group.in.cluster), pch = 15,
        col = cahPal3_tree, bty = 'y', cex = .5)

Figure 9: Classification tree (dendrogram), 3 clusters and solutions for an analytical framework: risk, renew, reproduce (Figure 2 in the published version)

Color schemes are prepared for all plots and maps. Color scheme does not follow the statistical structure of the dendrogram but gives a better interpretation of class outputs for the results (see below).

For further plots and mapping purposes, we combine the original dataset, the normalized variables and the cluster definitions. We use these files to prepare diagrams that provide the nominal and normalized values of each variable for each cluster, as interpretative figures. Only the z-values of the 10-cluster solution are included in the paper, as Figure 3.

Code

# Append dataframe with results
df <- data.frame(df, clus)
# For further analysis
save(df, file = "data/df_with_clusters2025.RData")

df.norm <- data.frame(df.norm, clus)

simplifiedprofile <- df %>%
  group_by(clus10) %>%
  # Anonymous function: passing extra arguments through across() is deprecated
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))

A function is defined to reduce the code…

Code

# Function to create profiles on normalized values average and extract table for central profile
prof <- function(x, var, clus, pal, type){
  prof <- aggregate(x[,var], by = list(x[,clus]), FUN = mean)
  colnames(prof)[1] <- "CLUSTER"
  
  prof <- melt(prof, id.vars = "CLUSTER")
  prof$clusters <- as.factor(prof$CLUSTER)
  prof$variable <- as.factor(prof$variable)

  # plot diagram
  if(type == "indicator"){
      g <- ggplot(prof) +
        geom_bar(aes(x = CLUSTER, y = value, fill = clusters), stat = "identity") + 
        facet_wrap(~variable,scales = "free") + coord_flip() + 
        scale_fill_manual(values = pal) +
        theme(text = element_text(size=10), 
              axis.text.y = element_text(size= 6),
              axis.text.x = element_text(size = 6))
  }
  
   if(type == "cluster"){
      g <- ggplot(prof) +
        geom_bar(aes(x = variable, y = value, fill = clusters), stat = "identity") + 
        facet_wrap(~ CLUSTER)+ 
        coord_flip() + 
        scale_fill_manual(values = pal) +
        theme(text = element_text(size=10), 
              axis.text.y = element_text(size= 6),
              axis.text.x = element_text(size = 6))
  }
  
  return(g)

}

…And applied by variables…

Code

prof(x = df, var = cols, clus = "clus3", pal = cahPal3, type = "indicator")

Figure 10: Hierarchical Cluster Analysis, 3 classes, absolute values

Code

prof(x = df, var = cols, clus = "clus10", pal = cahPal10, type = "indicator")

Figure 11: Hierarchical Cluster Analysis, 10 classes, absolute values

And clusters (z-normalized).

Code

prof(x = df.norm, var = cols, clus = "clus3", pal = cahPal3, type = "cluster")

Figure 12: Hierarchical Cluster Analysis, 3 classes, z values

Code

prof(x = df.norm, var = cols, clus = "clus10", pal = cahPal10, type = "cluster")

Figure 13: Hierarchical Cluster Analysis, 10 classes, z values (Figure 3 in the published version)

Cluster: 1. Risk. Moderate risk socially-mixed mature suburbs; 2. Renew. Suburban single-family homes; 3. Renew. New-build suburbs; 4. Risk. Middle-class suburban single-family homes; 5. Risk. Working-class suburban single-family homes; 6. Risk. Working-class inner-suburbs; 7. Renew. Recycled wealth in suburbs; 8. Reproduce. Exclusive central and suburban markets; 9. Reproduce. Late gentrified central districts and inner suburbs of Paris; 10. Reproduce. Central Paris markets.

Table 4 shows how the 10 clusters are distributed compared to the 3 overarching clusters (number of municipalities by cluster for the overall time-period).

Code

crosstable <- table(df.norm$clus10, df.norm$clus3)

crosstable2 <- addmargins(crosstable)

kable(crosstable2, booktabs = TRUE, row.names = TRUE, 
      caption = "")

	1	2	3	Sum
1	608	0	0	608
2	0	532	0	532
3	0	350	0	350
4	864	0	0	864
5	651	0	0	651
6	549	0	0	549
7	0	475	0	475
8	0	0	736	736
9	0	0	686	686
10	0	0	124	124
Sum	2672	1357	1546	5575

Table 4: Distribution of municipalities in the typology clusters within the 10*3 analytical framework categories
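
As a quick consistency check, the counts of Table 4 can be re-entered by hand to verify that each of the 10 clusters falls within a single regime. The sketch below uses base R only; the object names are illustrative, and the column labels follow the Risk / Renew / Reproduce legend used for the 3-class solution.

```r
# Counts copied from Table 4 (rows: 10 clusters, columns: 3 regimes)
tab <- matrix(0, nrow = 10, ncol = 3,
              dimnames = list(clus10 = as.character(1:10),
                              clus3  = c("Risk", "Renew", "Reproduce")))
tab[c(1, 4, 5, 6), "Risk"]    <- c(608, 864, 651, 549)
tab[c(2, 3, 7), "Renew"]      <- c(532, 350, 475)
tab[c(8, 9, 10), "Reproduce"] <- c(736, 686, 124)

# Each 10-class cluster should have observations in exactly one regime
nested <- all(rowSums(tab > 0) == 1)

# Share of observations per regime, in percent
regime_share <- round(100 * colSums(tab) / sum(tab), 1)

print(nested)
print(regime_share)
```

Since the table covers the whole period, the totals count municipality-year observations rather than distinct municipalities.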

Code

# To do some misc counts
crosstable <- table(df$clus10[df$year == 2002], df$clus3[df$year == 2002])
#crosstable <- addmargins(crosstable)
#crosstable

# Print pct
crosstable <- prop.table(crosstable)
crosstable <- addmargins(crosstable)
crosstable

crosstable3 <- table(df$clus3[df$year == 2018], df$clus10[df$year == 2018])
crosstable3 <- prop.table(crosstable3)
crosstable3 <- addmargins(crosstable3)
crosstable3

3 Applying the framework

3.1 A geography of market regimes

Code

# Get pal function to keep only colors included in the given year / case study
get_pal_typo <- function(x, study){
  x <- x[x[["study"]] == study,]
  x <- x[order(match(x[,"clusters", drop = TRUE], coldf$val)),] 
  xx <- st_set_geometry(x, NULL)
  xx <- coldf[coldf$val %in% xx[,"clusters"],]
  return(xx$cahpal)
}
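
The core of this helper is a palette subset: only the colours of clusters actually present in a study area are kept, presumably so that colours are not shifted when a cluster is absent from a map. A minimal sketch without spatial objects, where `coldf_toy` and `present` are hypothetical stand-ins for coldf and the mapped cluster column:

```r
# Hypothetical legend table: one colour per cluster value
coldf_toy <- data.frame(val    = c(1, 2, 3),
                        cahpal = c("red", "orange", "blue"),
                        stringsAsFactors = FALSE)

# Cluster values observed in one study area for one year (cluster 2 is absent)
present <- c(3, 1, 1, 3)

# Keep the colours of observed clusters only, in legend order
pal_toy <- coldf_toy$cahpal[coldf_toy$val %in% present]
print(pal_toy)
```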

Two series of maps are created, one for the 3-cluster solution and one for the 10-cluster solution. Maps are prepared with the mapsf library and displayed as a carousel using the slickR library.

Code

# Merge clusters outputs by year
for (i in 1:length(years)){
  com <- merge(com, 
               df[,c("INSEE_COM", "clus3")][df$year == years[i],],
               by = "INSEE_COM", all.x = TRUE)
  names(com)[length(com)-1] <- paste0("clus3_", years[i])
}

# Legend
coldf <- data.frame(cahpal = cahPal3,
                    val = unique(df$clus3[order(df$clus3), drop = T]),
                    leg_val = c("Risk", "Renew","Reproduce"))

# Create map series for each year
for (i in 1:length(years)){
  
  # Keep study area and typology for the year
  x <- com[,c(12, grep(paste0("clus3_", years[i]), names(com)))]
  colnames(x)[2] <- "clusters"
  x <- x[order(x[,"clusters", drop = TRUE]),]
  
  # Map export PNG for web applications
  mf_export(com[com[["study"]] == "Paris",], 
            file = paste0("fig/typo_3clusters_",
                          years[i], ".png"), width = 1200)

  # Paris
  mf_init(com[com[["study"]] == "Paris",])
  mf_inset_on(fig = c(0, .89, 0.3, 1))
  pal <- get_pal_typo(x = x, study = "Paris")
  mf_map(study[study$study == "Paris",], col = "lightgrey", border = NA)
  mf_map(x[x[["study"]] == "Paris",], type = "typo", var = "clusters", pal = pal,
         leg_pos = NA, add = TRUE)
  mf_map(dep, col = NA, border = "black", add = TRUE)
  mf_inset_off()
  
  # Lyon
  mf_inset_on(fig = c(0.6, 1, 0, 0.48))
  pal <- get_pal_typo(x = x, study = "Lyon")
  mf_map(study[study$study == "Lyon",], col = "lightgrey", border = NA)
  mf_map(x[x[["study"]] == "Lyon",], type = "typo", var = "clusters", pal = pal,
       leg_pos = NA, add = TRUE)
  mf_map(metro, col = NA, border = "black", add = TRUE)
  mf_map(study[study$study == "Lyon",], col = NA, border = "black", add = TRUE)
  mf_scale(size = 20, col = "black")
  mf_inset_off()

  # Avignon
  mf_inset_on(fig = c(0.35, 0.55, 0.05, 0.25))
  pal <- get_pal_typo(x = x, study = "Avignon")
  mf_map(study[study$study == "Avignon",], col = "lightgrey", border = NA)
  mf_map(x[x[["study"]] == "Avignon",], type = "typo", var = "clusters", pal = pal,
         leg_pos = NA, add = TRUE)
  mf_map(study[study$study == "Avignon",], col = NA, border = "black", add = TRUE)
  mf_inset_off()
  
  # Template
  leg(type = "typo", title = "", val = coldf$leg_val, pal = coldf$cahpal,
      pos =  c(581000, 6850000), box_cex = c(1.5,1.5), size = 1.2)
  mf_credits(txt = paste0("Source : Base PERVAL / BIEN – Chambre des Notaires,",
                          "INSEE, IGN, 2021\n",  "Credits: Le Goix, Kutz, Ysebaert, 2025"), cex = .8)
  mf_label(x = labels, var = "name", halo = FALSE, cex = 1.2, font = 4)
  mf_layout(title = paste0("Classification of market regimes - ", years[i]), 
            scale = FALSE, arrow = FALSE, frame = TRUE, credits = "")
  dev.off()
}

Code

dir <- "fig"
files <- list.files(dir)
sel <- paste(dir, files[grep("3clus", files)], sep = "/")

slickR(obj = sel, height = 500, width = "80%")  + 
  settings(dots = TRUE, autoplay = TRUE, fade = TRUE, speed = 100)

Figure 14: Typology of market regimes in Paris, Lyon and Avignon functional urban areas, 3 clusters

Code

clus10_labels <- c("1. Risk. Moderate risk in socially-mixed mature suburbs", 
                   "2. Renew. Suburban single-family homes as investments",
                   "3. Renew. New-build suburbs",
                   "4. Risk. Middle-class suburban single-family homes", 
                   "5. Risk. Working-class suburban single-family homes",
                   "6. Risk. Working-class inner-suburbs",
                   "7. Renew. Recycled wealth in suburbs", 
                   "8. Reproduce. Exclusive central and suburban markets",
                   "9. Reproduce. Late gentrified central districts and inner suburbs",
                   "10. Reproduce. Central Paris market")

# Merge clusters outputs by year
for (i in 1:length(years)){
  com <- merge(com, 
               df[,c("INSEE_COM", "clus10")][df$year == years[i],],
               by = "INSEE_COM", all.x = TRUE)
  names(com)[length(com)-1] <- paste0("clus10_", years[i])
}

# Legend
coldf <- data.frame(cahpal = cahPal10,
                    val = unique(df$clus10[order(df$clus10), drop = T]),
                    leg_val = clus10_labels)

# Create map series for each year
for (i in 1:length(years)){
  
  # Keep study area and typology for the year
  x <- com[,c(12, grep(paste0("clus10_", years[i]), names(com)))]
  colnames(x)[2] <- "clusters"
  x <- x[order(x[,"clusters", drop = TRUE]),]
  
  # Export PNG files for web applications
  mf_export(com[com[["study"]] == "Paris",], 
            file = paste0("fig/typo_10clusters_",
                          years[i], ".png"), width = 1200)
  
  # Paris
  mf_init(com[com[["study"]] == "Paris",])
  mf_inset_on(fig = c(0, .89, 0.3, 1))
  pal <- get_pal_typo(x = x, study = "Paris")
  mf_map(study[study$study == "Paris",], col = "lightgrey", border = NA)
  mf_map(x[x[["study"]] == "Paris",], type = "typo", var = "clusters", pal = pal,
         leg_pos = NA, add = TRUE)
  mf_map(dep, col = NA, border = "black", add = TRUE)
  mf_inset_off()
  
  # Lyon
  mf_inset_on(fig = c(0.6, 1, 0, 0.48))
  pal <- get_pal_typo(x = x, study = "Lyon")
  mf_map(study[study$study == "Lyon",], col = "lightgrey", border = NA)
  mf_map(x[x[["study"]] == "Lyon",], type = "typo", var = "clusters", pal = pal,
       leg_pos = NA, add = TRUE)
  mf_map(metro, col = NA, border = "black", add = TRUE)
  mf_map(study[study$study == "Lyon",], col = NA, border = "black", add = TRUE)
  mf_scale(size = 20, col = "black")
  mf_inset_off()

  # Avignon
  mf_inset_on(fig = c(0.35, 0.55, 0.05, 0.25))
  pal <- get_pal_typo(x = x, study = "Avignon")
  mf_map(study[study$study == "Avignon",], col = "lightgrey", border = NA)
  mf_map(x[x[["study"]] == "Avignon",], type = "typo", var = "clusters", pal = pal,
         leg_pos = NA, add = TRUE)
  mf_map(study[study$study == "Avignon",], col = NA, border = "black", add = TRUE)
  mf_inset_off()
  
  # Template
  leg(type = "typo", title = "", val = coldf$leg_val, pal = coldf$cahpal,
      pos =  c(581000, 6850000), box_cex = c(1.5,1.5), size = 1.2)
  mf_credits(txt = paste0("Source : Base PERVAL / BIEN – Chambre des Notaires,",
                          "INSEE, IGN, 2021\n",  "Credits: Le Goix, Kutz, Ysebaert, 2025"), cex = .8)
  mf_label(x = labels, var = "name", halo = FALSE, cex = 1.2, font = 4)
  mf_layout(title = paste0("Classification of market regimes - ", years[i]), 
            scale = FALSE, arrow = FALSE, frame = TRUE, credits = "")
  dev.off()
}

Our methodological approach extrapolates divergences in pathways of accumulated housing wealth and vulnerability longitudinally. The financial structure of the transactions is instructive: different combinations of property prices, purchasing power and indebtedness derived from the cluster analysis reveal three distinct and coherent housing regimes. When considered over time, these groupings and their substructures illustrate diverging pathways of accumulated household wealth and vulnerability: one towards increasing vulnerability (risk), another towards the consolidation and transmission of residential wealth (reproduce), and a third dynamic, between the two, evoking a mode of uncertain wealth renewal. Figure 15 outlines the financial parameters structuring the three regimes.

Code

# Import municipal income and format variable
inc <- read_csv("data/income.csv", 
                col_types = cols(INSEE_COM = col_character()))
inc <- melt(inc, id.vars = 'INSEE_COM')
inc$idYear <- paste0(inc$INSEE_COM, substr(inc$variable, 12, 15))
names(inc)[3] <- "Med_Income"

## Select indicators
df_sel <- df[,c("idYear", "year","clus3","PriceIncomeRatio","Debt2Price", "Price_m2")]
df_sel <- merge(df_sel, inc[,c("idYear", "Med_Income")], by = "idYear", all.x = TRUE)

# Quartiles
df_sel <- df_sel %>% mutate(Med_Income_dec = ntile(Med_Income, 4)) 
df_sel <- df_sel %>% mutate(price_dec = ntile(Price_m2, 4)) 

# Group by median income and price per m² quartiles
synth <- df_sel %>%
    group_by(year, Med_Income_dec, price_dec, clus3) %>% 
    summarise(across(.cols = where(is.numeric), 
                     .fns = list(Mean = ~ mean(.x, na.rm = TRUE)), 
                     .names = "{col}"))

synth$year <- as.factor(synth$year)

# Rename labels
synth$Med_Income_dec[synth$Med_Income_dec == '1'] <- 'Lower income Q1'
synth$Med_Income_dec[synth$Med_Income_dec == '2'] <- 'Q2'
synth$Med_Income_dec[synth$Med_Income_dec == '3'] <- 'Q3'
synth$Med_Income_dec[synth$Med_Income_dec == '4'] <- 'Higher income Q4'

synth$Med_Income_dec <- as.factor(synth$Med_Income_dec)
synth$Med_Income_dec <- ordered(synth$Med_Income_dec, c("Lower income Q1", "Q2", "Q3", "Higher income Q4"))

cluster_names <- as_labeller(
     c(`1` = "Risk", `2` = "Renew",`3` = "Reproduce"))

# Select indicators 
df_sel <- df[,c("year","clus10","clus3", "PriceIncomeRatio","Debt2Price", "Price_m2")]

# Group by year and 10 clusters (mean)
synth2 <- df_sel %>%
    group_by(year, clus10) %>% 
    summarise(across(.cols = where(is.numeric), 
                     .fns = list(Mean = ~ mean(.x, na.rm = TRUE)), 
                     .names = "{col}_{fn}"))

# Group by year and 10 clusters (standard deviation)
synth_tmp <- df_sel %>%
    group_by(year, clus10) %>% 
    summarise(across(.cols = where(is.numeric), 
                     .fns = list(sd = ~ sd(.x, na.rm = TRUE)), 
                     .names = "{col}_{fn}"))

# Merge outputs
synth2 <- merge(synth2, synth_tmp, all.x = TRUE)

# Data handling
synth2$clus10 <- as.factor(synth2$clus10)
synth2$clus3 <- as.factor(synth2$clus3_Mean)

# Price / sqm
p1 <-  ggplot(data = synth2, aes(x = year, group = clus10)) +
  geom_line(aes(y = Price_m2_Mean, color = clus10)) + 
  geom_ribbon(aes(y = Price_m2_Mean, 
                  ymin = Price_m2_Mean - Price_m2_sd,
                  ymax = Price_m2_Mean + Price_m2_sd, fill = clus10), 
              alpha = .1) +
  scale_y_log10() +
  labs(y = "Price / sqm", x = "year") +
  scale_color_manual(values = cahPal10) +
  scale_fill_manual(values = cahPal10) +
  theme(axis.text = element_text(size = 8), 
        axis.title = element_text(size = 8),
        plot.title = element_text(size = 10), 
        legend.title = element_blank()) +
  facet_wrap(.~clus3, labeller = labeller(clus3 = cluster_names)) 

# Price to income
p2 <-  ggplot(data = synth2, aes(x = year, group = clus10)) +
  geom_line(aes(y = PriceIncomeRatio_Mean, color = clus10)) + 
  geom_ribbon(aes(y = PriceIncomeRatio_Mean, 
                  ymin = PriceIncomeRatio_Mean - PriceIncomeRatio_sd, 
                  ymax = PriceIncomeRatio_Mean + PriceIncomeRatio_sd,
                  fill = clus10), alpha = .1) +
  labs(y = "Price to Income Ratio", x = "year") +
  scale_color_manual(values = cahPal10) +
  scale_fill_manual(values = cahPal10) +
  theme(axis.text = element_text(size = 8), 
        axis.title = element_text(size = 8),
        plot.title = element_text(size = 10)) +
  facet_wrap(.~clus3, labeller = labeller(clus3 = cluster_names))+
  theme(strip.text.x = element_blank())

# Debt to price
p3 <- ggplot(data = synth2, aes(x = year, group = clus10)) +
  geom_line(aes(y = Debt2Price_Mean, color = clus10)) + 
  geom_ribbon(aes(y = Debt2Price_Mean, 
                  ymin = Debt2Price_Mean - Debt2Price_sd, 
                  ymax = Debt2Price_Mean + Debt2Price_sd, 
                  fill = clus10), alpha = .1) +
  labs(y = "Debt to Price", x = "year") +
  scale_color_manual(values = cahPal10) +
  scale_fill_manual(values = cahPal10) +
  theme(axis.text = element_text(size = 8), 
        axis.title = element_text(size = 8),
        plot.title = element_text(size = 10)) +
  facet_wrap(.~clus3, labeller = labeller(clus3 = cluster_names))+
  theme(strip.text.x = element_blank())

# Print output
print(ggpubr::ggarrange(p1, p2, p3, ncol= 1,nrow = 3, 
                        common.legend = TRUE, legend = "bottom"))

Figure 15: Dynamics of financial parameters in households’ transactions, defining three pathways of unequal accumulation (**Figure 4 in the published version**)

Explanations

Risk: lower than average price/income and higher than average debt/income
Renew: lower than average price/income and lower than average debt/income
Reproduce: higher than average price/income and average debt/income
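
The comparison "lower / higher than average" can be made explicit by subtracting the overall mean of each indicator from the regime means. The sketch below is illustrative only: the values are invented, and the two columns simply reuse the ratio names found in df_sel (PriceIncomeRatio and Debt2Price).

```r
# Toy data (hypothetical values), with the ratio column names used in df_sel
toy <- data.frame(
  clus3            = c(1, 1, 2, 2, 3, 3),
  PriceIncomeRatio = c(3.0, 3.4, 3.2, 3.6, 6.0, 7.0),
  Debt2Price       = c(0.85, 0.80, 0.45, 0.50, 0.66, 0.68)
)
vars <- c("PriceIncomeRatio", "Debt2Price")

# Mean of each indicator by regime, and over the whole sample
by_regime <- aggregate(toy[, vars], by = list(clus3 = toy$clus3), FUN = mean)
overall   <- colMeans(toy[, vars])

# Difference to the overall mean: negative = below average, positive = above
deviation <- sweep(as.matrix(by_regime[, vars]), 2, overall)
rownames(deviation) <- c("Risk", "Renew", "Reproduce")
print(round(deviation, 2))
```

In this toy example the signs reproduce the three profiles listed above; with the real data the same table would have to be computed year by year, since Figure 15 shows that the indicators move over time.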

Code

sel <- paste(dir, files[grep("10clus", files)], sep = "/")

slickR(obj = sel, height = 500, width = "80%")  + 
  settings(dots = TRUE, autoplay = TRUE, fade = TRUE, speed = 100)

Figure 16: Selected maps of the typology of market regimes in Paris, Lyon and Avignon functional urban areas - HCA, Ward, 10 clusters, inertia 60% (**Figure 5 in the published version**)

Findings

Represented in reddish colors, risk trajectories predominate in markets distinguished by affordable properties and low, if not stagnant, price inflation since 2006. Risk is therefore found where affordability (lower price-to-income ratios) is tethered to high levels of indebtedness.

Cluster 1. Risk. Moderate risk in mixed mature suburbs (middle-class affordable / high-debt investments in mixed suburbs, average profile)
Cluster 4. Risk. Middle-class single-family home suburbs (with an over-representation of retirees as sellers, and of blue-collar workers and employees as buyers)
Cluster 5. Risk. Blue-collar and employee single-family home suburbs (very high debt-to-price ratios)
Cluster 6. Risk. Blue-collar and employee inner suburbs (very high debt-to-price ratios)

At the other end of the spectrum is a pathway characterized by the consolidation of household wealth (purple to bluish colors, Figure 14 to Figure 16). Absolute unaffordability, steady price inflation and a disproportionate presence of executives and managers, both as buyers and sellers, are paramount to understanding these cluster profiles. Acquisitions are distinguished by exceptionally high price-to-income and debt-to-income ratios, which suggests that buyers cannot afford to purchase property in this segment without recourse to personal wealth recycled through the sale of other assets or intergenerational wealth transfers – a tendency that has also been instrumental in fuelling local price inflation since the Great Recession.

Cluster 10. Reproduce. Central Paris market (smaller and older apartments, very high debt-to-price ratio)
Cluster 9. Reproduce. Unaffordable central neighborhoods and inner suburbs (higher percentage of small houses and apartments; workers and employees are also active buyers and sellers in these central submarkets)
Cluster 8. Reproduce. Gentrified former blue-collar and suburban markets (high debt-to-price ratio, high percentage of family homes)

The third trajectory characterizes a set of asset capitalization strategies by middle- and upper-middle-class buyers and sellers in areas with an over-representation of new single-family homes and small properties. Activity in this segment, marked by a combination of lower indebtedness, lower prices, and higher affordability, indicates that buyers rely on larger down payments, drawn from savings, from the resale of other property or from intergenerational wealth transfers, to realize acquisitions. We use the concept of renewal in an attempt to capture a series of investment characteristics that display, to varying degrees, the ideal-typical pattern of asset-based welfare valorization, but also processes of gentrification, build-to-let investment, and petty rentierism, without necessarily a clear distinction between them (Peris and Casanova Enault, 2023; Hochstenbach and Aalbers, 2023).

Cluster 2. Renew. Suburban SFH investments. Investments in the single-family home suburbs of the 1970s-80s, characterized by lower debt-to-price ratios, with new apartments contributing to densification and renewal.
Cluster 3. Renew. New-build suburbs (new apartments and new homes in mixed suburbs, with an over-representation of retirees among buyers).
Cluster 7. Renew. Suburban investments recycling wealth (very low debt-to-price ratios), with an over-representation of retirees among buyers and sellers.

3.2 Uneven and combined volatility

In this section of the paper, we consider volatility as two distinct but related phenomena: first, as the frequency of change in the profile of market activity within a given housing regime; and second, as the degree to which this activity switches between regimes of asset-based risk, renewal and reproduction.
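
These two notions can be illustrated with a small base R sketch. It is not the procedure used in the paper: the three toy sequences are invented, and the cluster-to-regime mapping is the one that can be read from Table 4.

```r
# Mapping of the 10 clusters to the 3 regimes, as read from Table 4
regime_of <- c("1" = "Risk", "2" = "Renew", "3" = "Renew", "4" = "Risk",
               "5" = "Risk", "6" = "Risk", "7" = "Renew",
               "8" = "Reproduce", "9" = "Reproduce", "10" = "Reproduce")

# Toy cluster sequences (hypothetical): one row per municipality, one column per date
seqs <- rbind(m1 = c(1, 1, 4, 4),   # one change of profile, always in Risk
              m2 = c(1, 2, 8, 8),   # two changes of profile, two regime switches
              m3 = c(9, 9, 9, 9))   # stable

# Number of positions where a sequence differs from the previous position
count_changes <- function(x) sum(x[-1] != x[-length(x)])

# Changes in the 10-class profile, and switches between the 3 regimes
profile_changes <- apply(seqs, 1, count_changes)
regime_switches <- apply(seqs, 1,
                         function(x) count_changes(regime_of[as.character(x)]))

# Changes of profile that stay within the same regime
within_regime <- profile_changes - regime_switches

print(cbind(profile_changes, regime_switches, within_regime))
```

Because the 10 clusters are nested in the 3 regimes, every regime switch is also a change of profile, so the difference between the two counts isolates within-regime volatility.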

We synthesized changes among the 708 municipalities into a flow chart (Figure 17) covering three periods that demarcate key policy changes in the development of asset-based welfare in France:

2002-2008: the onset of the housing bubble in Europe and North America;
2008-2012: the expansion of tax deductions for first-time homebuyers in the wake of the Great Recession;
2012-2018: the expansion of fiscal incentives for private multi-ownership investors to expand the stock of rental housing.

To this end, we produced a plot using the ggplot2, ggalluvial and ggthemes libraries. It shows the trajectories of the municipalities over time across the 10 classes of the typology.

Code

# Data preparation
df1 <- df[df$year == 2002, c("INSEE_COM", "clus10")]
names(df1)[2] <- "y_2002"
df2 <- df[df$year == 2008, c("INSEE_COM", "clus10")] 
names(df2)[2] <- "y_2008"
df3 <- df[df$year == 2012, c("INSEE_COM", "clus10")] 
names(df3)[2] <- "y_2012"
df4 <- df[df$year == 2018, c("INSEE_COM", "clus10")] 
names(df4)[2] <- "y_2018"
df_list <- list(df1, df2, df3, df4)

# Combine 
trans <- Reduce(function(x, y) merge(x, y), df_list)

# Count
trans <- count(trans, y_2002, y_2008, y_2012, y_2018) %>% 
  mutate(id = row_number()) # create new id variable

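# make long data (note: the column named "value" receives the year labels,
# and the column named "key" receives the cluster of each municipality)
trans <- gather(trans, value, key, -n, -id) 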

# clean up data for graph
trans <- trans %>%
  mutate(year = as.numeric(str_remove(value, "y_")),
         key = as.factor(key),
         key = fct_relevel(key, "1", "2", "3", "4", "5", "6", "7", "8", "9", "10")) 

# Plot 
# full graph syntax
p <- trans %>%
  ggplot(aes(x = year, stratum = key, alluvium = id, y = n, fill = key
  )) +
  theme_tufte(base_size = 12) +
  scale_fill_manual(values = cahPal10,
                    labels = clus10_labels) +
  geom_flow() +
  scale_x_discrete(name ="Year", 
                   limits = c(2002,2008,2012, 2018)) +
  geom_stratum(alpha = .5) +
  labs(x = "Years",
       y = "Municipality number",
       fill = NULL) +
  theme(legend.position = "bottom")

# Legend parameter
param_legend <- function(myPlot, pointSize, textSize, spaceLegend, nrow) {
  myPlot +
    guides(shape = guide_legend(override.aes = list(size = pointSize)),
           color = guide_legend(override.aes = list(size = pointSize)),
           fill= guide_legend(nrow = nrow,byrow = FALSE)) +
    theme(legend.title = element_text(size = textSize), 
          legend.text  = element_text(size = textSize),
          legend.key.size = unit(spaceLegend, "lines"))
}

param_legend(p, pointSize = 9, textSize = 9, spaceLegend = .4, nrow = 5)

Figure 17: Stability and change in local market regimes, 2002-2018 (**Figure 6 in the published version**)
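
The flows drawn in such a chart can also be summarized numerically as a transition table between two dates. The following sketch uses invented data for five municipalities; the column names mirror y_2002 and y_2018 above, and everything else is an assumption made for illustration.

```r
# Toy data (hypothetical): cluster of five municipalities at two dates
toy <- data.frame(y_2002 = c(1, 1, 4, 8, 2),
                  y_2018 = c(1, 8, 4, 8, 1))

# Use the full set of 10 clusters so that the table is square
lev <- as.character(1:10)
flows <- table(from = factor(toy$y_2002, levels = lev),
               to   = factor(toy$y_2018, levels = lev))

# Share of municipalities that are in the same cluster at both dates
stable_share <- sum(diag(flows)) / sum(flows)

# Row proportions: destination in 2018 for each 2002 cluster (NaN for empty rows)
flow_pct <- prop.table(flows, margin = 1)

print(stable_share)
print(flows[c("1", "2", "4", "8"), c("1", "4", "8")])
```

The diagonal of the table holds municipalities that stay in the same cluster, and off-diagonal cells correspond to the ribbons that cross between strata in the alluvial plot.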

4 Cross-tabulating market devices and the typology

We test the three risk / renew / reproduce clusters against the different categories of housing policy instruments defined according to ABC Zoning (2003; 2008; 2012; 2018), over the main time-stamps for which policy data are available. We then summarize our findings according to which policies were implemented for each zone. For this section, we use the aggregated files prepared by Le Brun (2022), available online:

Research report: https://github.com/riateStage/dispositif_lebrun/blob/main/Memoire_Pierre_Le-Brun.pdf
Methodology: https://riatestage.github.io/dispositif_lebrun/
GitHub repository: https://github.com/riateStage/dispositif_lebrun

The ABC zoning was created in 2003 as part of the rental investment scheme known as “Robien”. It has since been revised in 2006, 2009 and 2014. ABC zoning is used in particular for the scope of eligibility and for the applicable scales (rent and/or resource ceilings) for aid relating to rental investment (Denormandie, Loc’avantages), to home ownership (social rental-purchase loan, zero-rate loan, reduced-rate VAT in ANRU zones and priority city districts, real solidarity lease), as well as for intermediate rental housing and for setting rent ceilings for social housing financed by PLS.

Code

# NB using code written by P. Le Brun, from the GitHub doc, to tidy up the eligibility zoning.
load("data/ExportABC.RData")

#Short cluster labels in a vector
short.labels10 <- c("1 Rsk_Mid-class burbs", 
                                "2.Rnw New suburb apts",
                                "3.Rnw SFH",
                                "4.Rnw Mixed", 
                                "5. Rsk SmallApts",
                                "6. Rsk Mature",
                                "7. Rsk Struggling", 
                                "8. Rnw Suburb gentrif",
                                "9. Rprd Exclusive",
                                "10. Rptd Dwtn Paris")

short.labels3 <- c("1. Risk", "2.Renew", "3.Reproduce")

Code

## Selecting and merging variables
Device <- ExportABC[,c("CODGEO_2003","ZONE_ABC_2003")]
Device <- rename(Device, INSEE_COM = CODGEO_2003)
Device <- rename(Device, ABC_Zones = ZONE_ABC_2003)

Typology <- com[,c("INSEE_COM","clus3_2004")]
Typology <- rename(Typology, "Typo" = "clus3_2004")

fortest <- left_join(x = Device, y = Typology, by="INSEE_COM")

tab <- table(fortest$Typo,fortest$ABC_Zones)

test<-chisq.test(fortest$Typo,fortest$ABC_Zones)
test


    Pearson's Chi-squared test

data:  fortest$Typo and fortest$ABC_Zones
X-squared = 158.24, df = 4, p-value < 2.2e-16

Code

vcd::mosaic(~ ABC_Zones + Typo, data = fortest, shade = TRUE, set_labels=list(Typo = short.labels3), main ="2003*", rot_labels=c(45,0,0,0), just_labels = c("left"), varnames=FALSE)
# 
P_Zone_2003 <- test$p.value

Figure 18: Mosaic plots: links between ABC Zoning and Risk / Renew / Reproduce trajectories 2003 (**Figure 7 in the published version**)

For 2003 (marked with an asterisk in the plot title), the ABC zoning is compared with the 2004 clusters.

Code

## Selecting and merging variables
Device <- ExportABC[,c("CODGEO_2008","ZONE_ABC_2008")]
Device <- rename(Device, INSEE_COM = CODGEO_2008)
Device <- rename(Device, ABC_Zones = ZONE_ABC_2008)

Typology <- com[,c("INSEE_COM","clus3_2008")]
Typology <- rename(Typology, "Typo" = "clus3_2008")

fortest <- left_join(x = Device, y = Typology, by="INSEE_COM")

tab <- table(fortest$Typo,fortest$ABC_Zones)

test<-chisq.test(fortest$Typo,fortest$ABC_Zones)
test


    Pearson's Chi-squared test

data:  fortest$Typo and fortest$ABC_Zones
X-squared = 211.01, df = 6, p-value < 2.2e-16

Code

vcd::mosaic(~ ABC_Zones + Typo, data = fortest, shade = TRUE, varnames = FALSE,
            set_labels = list(Typo = short.labels3), main = "2008", 
            rot_labels = c(45,0,0,0), just_labels = c("left"))
P_Zone_2008 <- test$p.value

Figure 19: Mosaic plots: links between ABC Zoning and Risk / Renew / Reproduce trajectories 2008 (**Figure 7 in the published version**)

Code

## Selecting and merging variables
Device <- ExportABC[,c("CODGEO_2012","ZONE_ABC_2012")]
Device <- rename(Device, INSEE_COM = CODGEO_2012)
Device <- rename(Device, ABC_Zones = ZONE_ABC_2012)

Typology <- com[,c("INSEE_COM","clus3_2012")]
Typology <- rename(Typology, "Typo" = "clus3_2012")

fortest <- left_join(x = Device, y = Typology, by="INSEE_COM")

tab <- table(fortest$Typo,fortest$ABC_Zones)

test<-chisq.test(fortest$Typo,fortest$ABC_Zones)
test


    Pearson's Chi-squared test

data:  fortest$Typo and fortest$ABC_Zones
X-squared = 355.32, df = 8, p-value < 2.2e-16

Code

vcd::mosaic(~ ABC_Zones + Typo, data = fortest, shade = TRUE, 
            set_labels = list(Typo = short.labels3), main ="2012", 
            rot_labels = c(45,0,0,0), just_labels = c("left"), varnames=FALSE)
P_Zone_2012 <- test$p.value

Figure 20: Mosaic plots: links between ABC Zoning and Risk / Renew / Reproduce trajectories 2012 (**Figure 7 in the published version**)

Code

## Selecting and merging variables
Device <- ExportABC[,c("CODGEO_2018","ZONE_ABC_2018")]
Device <- rename(Device, INSEE_COM = CODGEO_2018)
Device <- rename(Device, ABC_Zones = ZONE_ABC_2018)

Typology <- com[,c("INSEE_COM","clus3_2018")]
Typology <- rename(Typology, "Typo" = "clus3_2018")

fortest <- left_join(x = Device, y = Typology, by="INSEE_COM")

tab <- table(fortest$Typo,fortest$ABC_Zones)

test<-chisq.test(fortest$Typo,fortest$ABC_Zones)
test


    Pearson's Chi-squared test

data:  fortest$Typo and fortest$ABC_Zones
X-squared = 400.87, df = 8, p-value < 2.2e-16

Code

vcd::mosaic(~ ABC_Zones + Typo, data = fortest, shade = TRUE,
            set_labels=list(Typo = short.labels3), main ="2018", 
            rot_labels = c(45,0,0,0), just_labels = c("left"), varnames=FALSE)
P_Zone_2018 <- test$p.value

Figure 21: Mosaic plots: links between ABC Zoning and Risk / Renew / Reproduce trajectories 2018 (**Figure 7 in the published version**)

Findings

Mosaic plots display contingency tables as rectangles, and represent a breakdown of the risk/renew/reproduce typology by A/B/C policy zoning. Shades display Pearson residuals, a standardized measure of how far the observed frequency deviates from the expected frequency. In 2008, there is a strong overrepresentation of A zones in the Reproduce category; in 2012, there is a strong underrepresentation of A areas in the Risk category.
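
For readers unfamiliar with Pearson residuals, the sketch below recomputes them by hand on a toy contingency table and compares them with the residuals returned by chisq.test(). The counts are invented; only the row and column labels echo the typology and the zoning.

```r
# Toy contingency table (hypothetical counts): regimes by zoning category
obs <- matrix(c(10, 30, 60,
                20, 40, 40,
                50, 30, 20),
              nrow = 3, byrow = TRUE,
              dimnames = list(Typo = c("Risk", "Renew", "Reproduce"),
                              ABC_Zones = c("A", "B", "C")))

test_toy <- chisq.test(obs)

# Pearson residuals by hand: (observed - expected) / sqrt(expected)
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
resid_manual <- (obs - expected) / sqrt(expected)

# Same values as the residuals stored in the test object
print(round(test_toy$residuals, 2))
print(all.equal(unname(resid_manual), unname(test_toy$residuals)))
```

A positive residual marks a cell that is more frequent than expected under independence (over-representation), a negative one a cell that is less frequent; the sum of squared residuals equals the chi-squared statistic.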

The three main trajectories (risk, renew and reproduce) are strongly associated with the spatial definitions of national public policies according to ABC zoning (chi-squared tests, p < 2.2e-16 at each of the four dates), meaning that households’ wealth accumulation opportunities within market dynamics are highly determined by housing market regulations.

5 Cited places and cities

This interactive map shows the location of all places and municipalities cited in the published version. Clicking on a municipality displays its trajectory over time (cluster membership).

Figure 22: Location of cited places and cities

6 Appendices

6.1 Sequence analysis of municipal profiles

When preparing the paper, we used the TraMineR package (Gabadinho et al. 2011) to analyze the output of the municipal typology over time using sequence analysis methodologies. We did not include this in the paper, but leave the method here for reference, as it helped us interpret the different municipal trajectories. We relied on some of these graphs to write the narratives on the sequence distributions and local dynamics.

The first plot (top left, Figure 23) shows the state frequencies for each time unit. The transversal entropy plot (top right) displays the evolution of the cross-sectional entropy over positions. The entropy is 0 when all municipalities are characterized by the same category and is maximal when each category accounts for the same proportion of municipalities. The modal state sequence plot (bottom left) shows the most frequent state at each position, together with its transversal frequency.
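
The behavior of the entropy at its two extremes can be checked with a few lines of base R. This is a sketch, not the TraMineR implementation: the function name `cross_entropy` is hypothetical, and dividing by the logarithm of the alphabet size (so that the maximum is 1) is an assumption made here.

```r
# Normalised Shannon entropy of one cross-section of states.
# Assumption: division by log(number of possible states), so the maximum is 1
cross_entropy <- function(states, n_states) {
  p <- table(states) / length(states)   # observed states only, so every p > 0
  -sum(p * log(p)) / log(n_states)
}

# Hypothetical cross-sections of 8 municipalities over a 4-state alphabet
all_same <- rep("1", 8)
uniform  <- rep(c("1", "2", "3", "4"), each = 2)
mixed    <- c(rep("1", 6), "2", "3")

print(cross_entropy(all_same, n_states = 4))
print(cross_entropy(uniform,  n_states = 4))
print(cross_entropy(mixed,    n_states = 4))
```

Intermediate situations, such as the `mixed` vector, fall strictly between the two bounds.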

Code

##### 4.1 Put results together ####
toto <- df[c("INSEE_COM","year","clus10")]

clusters_by_year <- dcast(toto, INSEE_COM ~ year)

# Export clusters_by_year as an XLS file for further analysis

writexl::write_xlsx(clusters_by_year,"data/clusters_by_year.xlsx", col_names = TRUE, format_headers = TRUE)

##### Text for interpretation of results
mvad.alphabet <- c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10")
mvad.labels <- c("1. Risk. Moderate risk in socially-mixed mature suburbs", 
                   "2. Renew. Suburban single-family homes as investments",
                   "3. Renew. New-build suburbs",
                   "4. Risk. Middle-class suburban single-family homes", 
                   "5. Risk. Working-class suburban single-family homes",
                   "6. Risk. Working-class inner-suburbs",
                   "7. Renew. Recycled wealth in suburbs", 
                   "8. Reproduce. Exclusive central and suburban markets",
                   "9. Reproduce. Late gentrified central districts and inner suburbs",
                   "10. Reproduce. Central Paris market")
mvad.scodes <- c("1 Risk", "2 Renew", "3 Renew", "4 Risk", "5 Risk", "6 Risk", "7 Renew", "8 Reproduce", "9 Reproduce", "10 Reproduce")

mvad.seq <- seqdef(clusters_by_year, 2:9, cpal = cahPal10, alphabet = mvad.alphabet, 
                   states = mvad.scodes, labels = mvad.labels, xtstep = 1)


par(mfrow = c(2, 2), mar = c(2,2,2,2))
seqdplot(mvad.seq, title = "a. Freq of states", withlegend = FALSE, border = NA)
seqHtplot(mvad.seq, title = "b. Entropy")
#seqmsplot(mvad.seq,  title = "c. Mean time", withlegend = FALSE, border = NA)
seqmtplot(mvad.seq, title = "c. Modal states", cex.axis = .7, withlegend = FALSE)
seqlegend(mvad.seq, cex = 0.6)

Figure 23: Frequencies, entropy and modal states of sequences

Code

## First we plot the 50 most common sequences. Identical sequences will show up as common patterns
seqfplot(mvad.seq, tlim=1:50,  with.legend = FALSE)

seq_str <- seqconc(mvad.seq)

Code

seqIplot(mvad.seq, sortv = "from.start", with.legend = FALSE,  cex.legend=.6, 
         title = "State sequences (all)")

Figure 25: An exploratory typology of cluster sequences with TraMineR, to extract the most common patterns

Code

selected_row <- clusters_by_year[clusters_by_year$INSEE_COM == "69086", ]

single.seq <- seqdef(selected_row, 2:9, cpal = cahPal10, alphabet = mvad.alphabet, states = mvad.scodes, labels = mvad.labels, xtstep = 1)

seqiplot(single.seq, with.legend = FALSE)

Findings

Our findings indicate that the presumed ideal-typical trajectory of asset valorization is the exception rather than the norm in most residential markets in our sample. The few cases that actually follow this pattern were all located in the Paris region, switching from investment risk (Clusters 1, 4, 5 or 6) to reproduction over time. Examples include places like Saint-Maurice (wedged between the Seine and Paris’ Bois de Vincennes), Villejuif (a suburb at the southernmost terminus of the Paris Metro system), Maisons-Alfort (an eastern suburban corridor, 8 km from Paris), and Pontoise (a New Town development North-West of Paris). Acquisitions made in these areas at the start of the 2000s would have occurred in the context of the most average profile of buying and selling in the country; these markets quickly moved away from risk profiles and developed features of attractive, in-demand destinations for buyers and sellers alike. Outside Paris, however, Lyon and Avignon showed a consistent pattern of renew-based market activity; changes between trajectories tended to oscillate between risk and renew clusters with almost no change into or out of reproduced wealth profiles.

Overall, rather than consistency and stability, residential markets in France display diverse and contrasting valorization pathways that are geographically clustered by region. Municipalities characterized by risk represent the lion’s share of our sample (Figure 23): their proportion fell from 60.5% to 43.8% of residential activity between 2002 and 2012, but rose again to 52.8% by 2018. Municipalities characterized by the consolidation and reproduction of residential wealth more than doubled, from just 9.1% of the sample in 2002 to 22.7% by 2018. This division of the market into accumulated wealth and vulnerability is all the more striking as it occurred precisely at a time when legislation was expanding to improve the quality and affordability of housing throughout the country. Wealth deterioration has been especially acute in suburbs built between the 1970s and 1990s, where ageing, working-class populations are concentrated (Lévy 2005). Although such municipalities may offer more affordable entry-level options for homeowners, buying in these areas typically implies a heavier debt burden combined with low, if not declining, property values vis-à-vis other markets. In fact, depreciation in some areas has been significant enough to shift them into an entirely new residential trajectory. Peripheral markets outside Lyon and Avignon are a good example of this trend: patterns of renewal-based activity before 2008 progressively depreciated into risk-laden profiles.
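Yearly shares of this kind can be obtained from a simple cross-tabulation. The following base R sketch uses an invented long table (`toy`, with hypothetical `year` and `regime` columns); the figures it produces are not those of the paper.

```r
# Hypothetical long table: one row per municipality and year, with its regime.
toy <- data.frame(
  year   = rep(c(2002, 2018), each = 5),
  regime = c("risk", "risk", "risk", "renew", "reproduce",
             "risk", "risk", "renew", "reproduce", "reproduce"),
  stringsAsFactors = FALSE
)

# Cross-tabulate regime by year, then convert counts to column percentages
# so that each year sums to 100.
counts <- table(toy$regime, toy$year)
shares <- round(100 * prop.table(counts, margin = 2), 1)
shares
```

Using `margin = 2` normalizes within each year, which is what makes the shares comparable over time even if the number of municipalities observed varies.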

6.2 Refracting stratification across municipalities

Thus far, our analysis has focused on interpreting trajectories of household wealth and vulnerability derived from place-specific characteristics of residential transactions. To generalize these findings, we scale up the analysis to assess the degree to which financial indicators are linked to the articulation of asset-based risk, renewal and reproduction. Rather than comparing trends in market activity between municipalities, we examine these tendencies across the entire dataset, regardless of where the transactions took place. In short, we return to a more assertive notion of class in the study of housing welfare and inequality (Mckee and Muir 2013). We use a graphical matrix to gauge how, and to what extent, the financial structure of acquisitions (namely purchasing power and indebtedness) is linked to trajectories of household risk, renewal and reproduction. Figure 26 and Figure 27 each show three columns, one per housing regime, broken down by year on the x-axis (2002-2018). The left-hand vertical axis shows property values, subdivided into price quartiles from lowest to highest; the right-hand vertical axis shows income quartiles, likewise from lowest to highest.
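To make the structure of the matrix concrete, the sketch below shows one way the plotted indicators and quartile classes could be derived. It is an illustration under stated assumptions, not the notebook's actual preparation of `synth`: the input columns (`price_m2`, `monthly_income`, `loan`, `price`), the helper `quartile_class()` and all values are hypothetical, and the reading of the price-to-income ratio as months of income per square metre is an assumption.

```r
# Hypothetical municipal-level inputs (all values invented for illustration).
toy <- data.frame(
  price_m2       = c(1500, 2200, 3100, 4800, 2600, 3900, 1800, 5200), # EUR per m2
  monthly_income = c(1800, 2100, 2500, 3200, 2300, 2900, 1900, 3600), # EUR per month
  loan           = c(90, 120, 150, 160, 130, 140, 100, 150),          # kEUR borrowed
  price          = c(100, 150, 200, 320, 170, 250, 120, 400)          # kEUR paid
)

# Assumed definition: months of income needed to buy one square metre.
toy$PriceIncomeRatio <- toy$price_m2 / toy$monthly_income

# Assumed definition: share of the purchase price financed by borrowing.
toy$Debt2Price <- toy$loan / toy$price

# Hypothetical helper: assign each value to a quartile class 1..4.
quartile_class <- function(x) {
  breaks <- quantile(x, probs = seq(0, 1, 0.25), na.rm = TRUE)
  as.integer(cut(x, breaks = breaks, include.lowest = TRUE))
}

# Price quartile stays numeric (continuous y-axis); income quartile becomes
# a factor so it can be used as a facet.
toy$price_dec      <- quartile_class(toy$price_m2)
toy$Med_Income_dec <- factor(quartile_class(toy$monthly_income))

toy[, c("price_dec", "Med_Income_dec", "PriceIncomeRatio", "Debt2Price")]
```

A combination of price quartile, income quartile and regime with no observations has no row in such a table, which is how blank cells arise in a tile plot.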

Code

# A function to plot indicators all together as a tile matrix:
# years (x) by price quartile (y), faceted by income quartile (rows) and regime (columns).
#   x     : data frame with year, price_dec, Med_Income_dec, clus3 and the indicator column
#   fill  : name of the indicator column (character)
#   label : legend title
plotInd <- function(x, fill, label){
  p <- ggplot(x, aes(.data[["year"]], .data[["price_dec"]], fill = .data[[fill]])) +
    geom_tile() +
    scale_fill_viridis(name = label, direction = -1, option = "C") +  # from viridis
    # Reverse income levels so the highest bracket is drawn in the top row
    facet_grid(fct_rev(Med_Income_dec) ~ clus3, labeller = labeller(clus3 = cluster_names)) +
    scale_y_continuous(breaks = c(1, 2, 3, 4),
                       labels = c("Lower prices Q1", "Q2", "Q3", "Higher prices Q4"),
                       name = "Price quartile") +
    theme_minimal(base_size = 8) +
    theme(legend.position = "bottom",
          plot.title = element_text(size = 14, hjust = 0),
          axis.text.y = element_text(size = 6),
          axis.text = element_text(size = 7),
          axis.ticks = element_blank(),
          strip.background = element_rect(colour = "white"),
          legend.title = element_text(size = 8),
          legend.text = element_text(size = 6)) +
    removeGrid()  # from ggExtra
  return(p)
}

plotInd(x = synth, fill = "PriceIncomeRatio", label = "Price to income ratio")

Figure 26: Class stratification as a function of purchasing power

Code

plotInd(x = synth, fill = "Debt2Price", label = "Debt to price ratio")

Figure 27: Class stratification as a function of indebtedness

Findings

First, with regard to purchasing power, Figure 26 shows that the price-to-income ratio increased between 2002 and 2006 across every market regime. The higher the price bracket of property in a municipality (left y-axis), the greater the financial effort households need to acquire it. Over time, price increasingly decouples from income, especially in the highest-priced markets, where property values represent more than 3 months of income per m². For each income bracket (right y-axis), housing prices and price-to-income ratios are consistently higher in “reproduce” markets than in other areas. The higher the income bracket of buyers, the higher the price-to-income ratio of transactions. Many blank spots also appear in the lower-priced brackets of the “reproduce” column of the matrix, because low-priced properties simply do not exist in this regime category.

Household indebtedness (Figure 27) also increased over the 2000s across all three market regimes, with the exception of the period between 2010 and 2012, when global interest rates dropped to historic lows in the years following the Great Recession. After 2015, the situation again worsened for all income brackets. Consistent with our previous findings, the entire risk segment of our sample exhibited a pronounced debt burden, while debt-to-price ratios progressively decreased in renew-based markets, where home equity is rechannelled to lower the deposit and/or leverage needed for investment in new-build districts and gentrifying neighborhoods.


Code

sessionInfo()

R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Paris
tzcode source: internal

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] vcd_1.4-13           ggExtra_0.10.1       viridis_0.6.5       
 [4] viridisLite_0.4.2    ggthemes_5.1.0       ggalluvial_0.12.5   
 [7] leaflet.extras_2.0.1 leaflet_2.2.2        slickR_0.6.0        
[10] maplegend_0.1.0      mapsf_0.12.0         kableExtra_1.4.0    
[13] cluster_2.1.6        TraMineR_2.2-10      fastcluster_1.2.6   
[16] factoextra_1.0.7     FactoMineR_2.11      readxl_1.4.3        
[19] lubridate_1.9.4      forcats_1.0.0        stringr_1.5.1       
[22] purrr_1.0.2          readr_2.1.5          tidyr_1.3.1         
[25] tibble_3.3.0         ggplot2_3.5.1        tidyverse_2.0.0     
[28] reshape2_1.4.4       dplyr_1.1.4          sf_1.0-19           

loaded via a namespace (and not attached):
  [1] RColorBrewer_1.1-3      rstudioapi_0.17.1       jsonlite_1.8.9         
  [4] wk_0.9.4                magrittr_2.0.3          estimability_1.5.1     
  [7] farver_2.1.2            rmarkdown_2.28          vctrs_0.6.5            
 [10] base64enc_0.1-3         rstatix_0.7.2           htmltools_0.5.8.1      
 [13] broom_1.0.7             cellranger_1.1.0        s2_1.1.7               
 [16] Formula_1.2-5           KernSmooth_2.23-24      htmlwidgets_1.6.4      
 [19] plyr_1.8.9              emmeans_1.10.5          zoo_1.8-12             
 [22] mime_0.12               lifecycle_1.0.4         pkgconfig_2.0.3        
 [25] Matrix_1.6-5            R6_2.6.1                fastmap_1.2.0          
 [28] shiny_1.9.1             digest_0.6.37           colorspace_2.1-1       
 [31] crosstalk_1.2.1         ggpubr_0.6.0            vegan_2.6-8            
 [34] labeling_0.4.3          timechange_0.3.0        abind_1.4-8            
 [37] mgcv_1.9-1              compiler_4.3.1          proxy_0.4-27           
 [40] bit64_4.6.0-1           withr_3.0.2             backports_1.5.0        
 [43] carData_3.0-5           DBI_1.2.3               psych_2.4.6.26         
 [46] ggsignif_0.6.4          MASS_7.3-60.0.1         classInt_0.4-10        
 [49] scatterplot3d_0.3-44    permute_0.9-7           flashClust_1.01-2      
 [52] tools_4.3.1             units_0.8-5             lmtest_0.9-40          
 [55] httpuv_1.6.15           glue_1.8.0              nlme_3.1-166           
 [58] promises_1.3.0          checkmate_2.3.2         generics_0.1.4         
 [61] leaflet.providers_2.0.0 gtable_0.3.6            tzdb_0.4.0             
 [64] class_7.3-22            hms_1.1.3               xml2_1.3.6             
 [67] car_3.1-3               ggrepel_0.9.6           pillar_1.11.0          
 [70] vroom_1.6.5             later_1.3.2             splines_4.3.1          
 [73] lattice_0.22-6          bit_4.6.0               tidyselect_1.2.1       
 [76] miniUI_0.1.1.1          knitr_1.48              gridExtra_2.3          
 [79] svglite_2.1.3           xfun_0.49               DT_0.33                
 [82] stringi_1.8.4           yaml_2.3.10             boot_1.3-31            
 [85] evaluate_1.0.1          multcompView_0.1-10     cli_3.6.5              
 [88] xtable_1.8-4            systemfonts_1.1.0       jquerylib_0.1.4        
 [91] munsell_0.5.1           Rcpp_1.0.13-1           coda_0.19-4.1          
 [94] parallel_4.3.1          leaps_3.2               mvtnorm_1.3-1          
 [97] scales_1.3.0            e1071_1.7-16            writexl_1.5.4          
[100] crayon_1.5.3            rlang_1.1.6             cowplot_1.1.3          
[103] mnormt_2.1.1

7 References

Casanova, L., G. Boulay, Y. Gérard, and L. Yahi. 2017. “Deux bases de données, aucune référence de prix. Comment observer les prix immobiliers en France avec Dvf et Perval ?” Revue d’Économie Régionale Et Urbaine Octobre (4): 711–32. https://doi.org/10.3917/reru.174.0711.

Gabadinho, A., G. Ritschard, N. S. Müller, and M. Studer. 2011. “Analyzing and visualizing state sequences in R with TraMineR.” Journal of Statistical Software 40 (4): 1–37.

Le Brun, P. 2022. “Un soutien géographiquement inégal : la sélectivité spatiale des aides publiques à l’investissement immobilier résidentiel des ménages en France.” Géographie, économie, Société 24 (1): 43–68. https://doi.org/10.3166/ges.2022.0002.

Le Goix, R., R. Ysebaert, T. Giraud, M. Lieury, G. Boulay, M. Coulon, S. Rey-Coyrehourcq, et al. 2021. “Unequal housing affordability across European cities. The ESPON Housing Database, Insights on Affordability in Selected Cities in Europe.” Cybergeo: European Journal of Geography Data papers (974). https://doi.org/10.4000/cybergeo.36478.

Mckee, K., and J. Muir. 2013. “An Introduction to the Special Issue – Housing in Hard Times: Marginality, Inequality and Class.” Housing, Theory and Society 30 (1): 1–9. https://doi.org/10.1080/14036096.2012.682817.

Müllner, D. 2013. “fastcluster: Fast Hierarchical, Agglomerative Clustering Routines for R and Python.” Journal of Statistical Software 53 (9): 18. http://www.jstatsoft.org/v53/i09.