ActivityFacilities

Activity locations in MATSim plans can be described either by a coordinate or by a facility. A facility object contains an id, x- and y-coordinates, at least one ActivityOption and, optionally, user-defined attributes. The ActivityOptions define which types of activities can be performed at a facility and can be assigned weights / capacities and opening hours.
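
The structure described above can be pictured with a small data model. The following is only an illustrative sketch in Python, not the MATSim API: the class and field names (`ActivityOption`, `ActivityFacility`, `capacity`, `opening_hours`, `allows`) and the example values are assumptions chosen to mirror the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ActivityOption:
    """Hypothetical stand-in for one activity type offered at a facility."""
    activity_type: str
    capacity: Optional[float] = None  # weight / capacity, if assigned
    opening_hours: List[Tuple[str, str]] = field(default_factory=list)  # (start, end)


@dataclass
class ActivityFacility:
    """Hypothetical stand-in for a facility: id, coordinate, options, attributes."""
    facility_id: str
    x: float
    y: float
    options: List[ActivityOption]
    attributes: Dict[str, object] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # A facility must offer at least one ActivityOption.
        if not self.options:
            raise ValueError("a facility needs at least one ActivityOption")

    def allows(self, activity_type: str) -> bool:
        """Return True if the given activity type can be performed here."""
        return any(o.activity_type == activity_type for o in self.options)


# Made-up example facility with two activity options.
bakery = ActivityFacility(
    "f1", 1000.0, 2000.0,
    [ActivityOption("shopping", 2.0, [("07:00", "18:00")]),
     ActivityOption("work", 1.0)],
    {"name": "Example bakery"},
)
```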

Scope

Since agents are allowed to choose activities outside the study area, a buffer zone of 35 km is created around the study area. For this buffer zone, the facilities are generated in the same way as for the study area in order to obtain more realistic travel diaries. Unless specified otherwise, the analyses in this report are calculated for all facilities, including those in the buffer zone.

Input data

The ActivityFacilities for the study region are derived from OpenStreetMap (OSM) without input from other data sources, though this might be changed in the future. Nodes, ways and relations with the following tags are evaluated for the creation of ActivityFacilities:

  • amenity
  • shop
  • office
  • leisure
  • healthcare
  • tourism
  • historic
  • social_facility
  • building
  • landuse
  • addr:housenumber

Overview Processing

The generation of ActivityFacilities is organised in four steps.

  1. Configuration
  2. Parsing OSM data
  3. Preprocessing
  4. Facility creation

In the first step, a special config is created. This config specifies which OSM tags are considered relevant for the generation of ActivityFacilities, which activity option is available for which tag and what level of usage intensity is assigned to each activity option. In the second step, the relevant OSM objects (Nodes, Ways, Relations) are collected. These OSM objects are then used in the Preprocessor to generate semantic objects (Poi, Building, Landuse, HouseNumber). Then, a spatial matching between these semantic objects is performed to determine which objects belong together (e.g. which house numbers belong to which buildings), and a number of statistics are calculated. Finally, the ActivityFacilities are created and assigned ActivityOptions and other relevant attributes.

Configuration

The Configuration defines which OSM tags are relevant for the generation of ActivityFacilities. The tag filters differentiate between POIs, buildings and land-use areas, as can be seen in the examples in Figures 1 to 3. Additionally, the config specifies for each tag which activity types are permitted at the corresponding ActivityFacilities and which intensity level is assumed for each activity type.

Facility_Config_Amenity

Figure 1: Excerpt FacilityConverterConfig amenities

Facility_Config_Building

Figure 2: Excerpt FacilityConverterConfig buildings

Facility_Config_Landuse

Figure 3: Excerpt FacilityConverterConfig land-use areas

Finally, the Configuration defines the numerical values of the weights per floor area [$1/m^{2}$] by activity type and intensity level (see Section Weights).
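
To make the role of the Configuration more concrete, the mapping from OSM tags to activity types and intensity levels can be sketched as a nested lookup. This is a minimal sketch only: the structure `TAG_CONFIG` and the function `activity_options_for` are hypothetical and do not reproduce the actual config format shown in the figures. The example entries follow the classifications listed in Table 2, and the rule that the first matching tag wins for a given activity type is an assumption.

```python
# Hypothetical config: OSM key -> OSM value -> {activity type: intensity level}.
TAG_CONFIG = {
    "amenity": {
        "cafe": {"leisure": "HIGH"},
        "kindergarten": {"education": "HIGH"},
        "school": {"education": "MEDIUM", "work": "MEDIUM"},
    },
    "shop": {
        "bakery": {"shopping": "HIGH", "work": "MEDIUM"},
        "supermarket": {"shopping": "MEDIUM", "work": "MEDIUM"},
    },
}


def activity_options_for(tags):
    """Collect activity types and intensity levels for all relevant tags.

    Assumption: if several tags define the same activity type,
    the first one encountered is kept.
    """
    options = {}
    for key, value in tags.items():
        for activity, level in TAG_CONFIG.get(key, {}).get(value, {}).items():
            options.setdefault(activity, level)
    return options


bakery_options = activity_options_for({"shop": "bakery", "name": "Example"})
road_options = activity_options_for({"highway": "residential"})
```

In this sketch, an object without any relevant tag simply yields an empty result and would therefore not lead to an ActivityFacility.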

Parsing OSM data

Then, the raw data in the OSM file is parsed and converted into Node, Way and Relation objects and stored in suitable data containers. The OSM objects are filtered according to the definitions in the config file as well as spatially to include only objects that contain at least one node within the study region.
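
The spatial part of this filter can be illustrated as follows. The sketch assumes, for simplicity, that the study region is an axis-aligned bounding box; the actual region is presumably a polygon. The function names `within_bbox` and `keep_object` are hypothetical.

```python
def within_bbox(node, bbox):
    """Return True if a node (x, y) lies inside bbox (min_x, min_y, max_x, max_y)."""
    x, y = node
    min_x, min_y, max_x, max_y = bbox
    return min_x <= x <= max_x and min_y <= y <= max_y


def keep_object(node_coords, bbox):
    """Keep an OSM object if at least one of its nodes lies in the study region."""
    return any(within_bbox(node, bbox) for node in node_coords)


study_region = (0.0, 0.0, 100.0, 100.0)  # assumed bounding box
crossing_way = [(-20.0, 50.0), (10.0, 50.0)]  # one node inside the region
outside_way = [(-20.0, 50.0), (-10.0, 50.0)]  # no node inside the region
```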

Preprocessing

In the next step, the nodes, ways and relations are processed and objects for POIs, buildings, land-use areas and house numbers are created. A spatial matching between these objects is performed to determine, for example, which house numbers and which buildings belong together. Then, several statistics are calculated that are later used to generate ActivityFacilities and to impute attributes for ActivityFacilities and ActivityOptions. The first of these imputations, for the number of floor levels per building, concludes the preprocessing.

Pois, Buildings, Landuses, HouseNumbers

Poi objects can be generated from nodes, ways or relations, whereas Building and Landuse objects are generated from ways and relations only. HouseNumbers are generated from nodes, provided that the house number is not already tagged on a POI or building. Each of these objects contains a Geometry and a map of all OSM tags.

A relation can contain several nodes, ways and even other relations as can be seen for example for the Berlin Tierpark relation in Figure 4.

Facility_Relation_Tierpark

Figure 4: Example for a Relation with Several Elements

Within a relation, ways describing boundaries are often tagged as “outer” or “inner” polygons as can be seen in Figure 5 where the relation contains one outer boundary and two inner boundaries. Each boundary consists of one or more ways.

Facility_Relation_Multipolygon

Figure 5: Example for a Relation with "Outer" and "Inner" Boundaries

Therefore, when processing relations with more than one way, the ways are first assembled into closed polygons and then sorted into inner and outer polygons. Subsequently, using spatial matching, related inner and outer polygons are found. For each outer polygon, a multipolygon is created that also contains all associated inner polygons. Finally, a Poi, Building or Landuse object is generated for each multipolygon, with the OSM tags of the relation assigned to all of them.
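
The first of these steps, assembling ways into closed polygons, can be sketched as follows. This is a simplified illustration and not the implementation used in the application: ways are represented as lists of node ids, the function name `assemble_rings` is hypothetical, and the subsequent sorting into inner and outer polygons is not covered.

```python
def assemble_rings(ways):
    """Join ways (lists of node ids) end to end until each ring is closed."""
    remaining = [list(way) for way in ways]
    rings = []
    while remaining:
        ring = remaining.pop(0)
        while ring[0] != ring[-1]:
            for i, way in enumerate(remaining):
                if way[0] == ring[-1]:
                    # Way continues the ring in its stored direction.
                    ring.extend(way[1:])
                elif way[-1] == ring[-1]:
                    # Way continues the ring when traversed backwards.
                    ring.extend(reversed(way[:-1]))
                else:
                    continue
                remaining.pop(i)
                break
            else:
                raise ValueError("ways do not form a closed ring")
        rings.append(ring)
    return rings


# Three open ways forming one boundary, plus one way that is already closed.
example_rings = assemble_rings([[1, 2, 3], [5, 4, 3], [5, 1], [10, 11, 12, 10]])
```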

Spatial matching

The following spatial matchings are carried out:

  • Pois and Buildings (n:n)
  • HouseNumbers and Buildings (n:1)
  • Pois and Landuses (n:1)
  • Buildings and Landuses (n:1)

Note that while several house numbers can be matched to one building, each house number can only be matched to one building. The same applies to the matching of POIs or buildings to land-use areas. POIs and buildings, in contrast, are matched n:n, i.e. one POI can be matched to several buildings and several POIs can be matched to the same building. For partially overlapping objects, a spatial matching is considered successful if the overlap covers at least 50% of the smaller polygon’s area.
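
The 50% overlap rule can be illustrated with a small sketch. To keep the example self-contained, axis-aligned rectangles `(min_x, min_y, max_x, max_y)` are used as stand-ins for arbitrary polygons; the function names are hypothetical.

```python
def rect_area(rect):
    """Area of an axis-aligned rectangle (min_x, min_y, max_x, max_y)."""
    return max(0.0, rect[2] - rect[0]) * max(0.0, rect[3] - rect[1])


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0


def is_match(a, b, min_share=0.5):
    """Match if the overlap covers at least min_share of the smaller object."""
    smaller = min(rect_area(a), rect_area(b))
    if smaller <= 0:
        return False
    return overlap_area(a, b) / smaller >= min_share


building = (0.0, 0.0, 10.0, 10.0)
poi_half_inside = (8.0, 0.0, 12.0, 10.0)     # 20 of 40 m² overlap
poi_mostly_outside = (9.0, 0.0, 13.0, 10.0)  # 10 of 40 m² overlap
```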

Study region statistics

The main purpose of the statistics derived in this step is to serve as input for the imputation of ActivityFacility and ActivityOption attributes. Since the characteristics described by the statistics can differ substantially between study regions, they are re-calculated for each study region and are an integral part of the facility generation process. Table 1 gives an overview of the statistics and their usage in this application.

| Statistic | Usage |
|---|---|
| Distribution of the number of levels in a building by building type | Floor level imputation |
| Average floor level height by building type | Floor level imputation |
| Quantiles of building density by land-use type | Facility creation (land-use without POIs or buildings) |
| Quantiles of building footprints by activity type | Facility creation (facilities without polygons) |
| 90% quantile for building footprints of single-family homes | Weight calculation home locations |

Table 1: Calculated Statistics and their Usage

Floor Level Imputation

The number of floor levels in a building is later used to calculate the weight of multifamily home facilities. For buildings that are not tagged with the “building:levels”-tag, the number of floor levels is imputed in a two-step process. If information about the building’s height is given in the OSM data, the number of floor levels is determined by dividing the building height by the average floor height (as calculated in the study region statistics) and rounding the result down to an integer. A distinction is made between the average floor height of buildings with the tag “apartment” and that of other residential buildings.

Otherwise, the number of floor levels is randomly drawn from the distribution of the number of levels in a building that was calculated for the study region (see Section Study region statistics). Three different building types are taken into account: apartment buildings, other residential buildings and other buildings. The maximum number of levels is set to 10, since it is assumed that buildings with more levels are usually tagged accordingly in OSM.
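
The two-step imputation can be sketched as follows. This is an illustration under several assumptions: the function name `impute_levels` is hypothetical, the building height is read from a `height` tag containing a plain number in metres, and the level distribution is passed as a mapping from number of levels to observed frequency. The average floor height of 3.848 m is the Berlin value for apartment buildings reported in the results.

```python
import math
import random

MAX_LEVELS = 10  # cap for randomly drawn levels, as stated in the text


def impute_levels(tags, avg_floor_height, level_distribution, rng):
    """Return the tagged number of levels, or impute it from height or statistics."""
    if "building:levels" in tags:
        return int(float(tags["building:levels"]))
    if "height" in tags:
        # Step 1: building height divided by average floor height, rounded down.
        return math.floor(float(tags["height"]) / avg_floor_height)
    # Step 2: random draw from the observed distribution, capped at MAX_LEVELS.
    levels = [n for n in level_distribution if n <= MAX_LEVELS]
    weights = [level_distribution[n] for n in levels]
    return rng.choices(levels, weights=weights, k=1)[0]


rng = random.Random(42)
from_height = impute_levels({"height": "15"}, 3.848, {}, rng)
from_tag = impute_levels({"building:levels": "5"}, 3.848, {}, rng)
from_draw = impute_levels({}, 3.848, {1: 5, 2: 3, 12: 1}, rng)
```

In practice, `avg_floor_height` and `level_distribution` would be selected according to the building type.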

Facility Creation

The Facility Creation is the part of the application where the MATSim ActivityFacilities are created and their attributes are calculated and assigned. ActivityFacilities are created for

  • Pois (taking into account Buildings and HouseNumbers)
  • Buildings without Pois (taking into account HouseNumbers)
  • Landuse areas without Buildings or Pois

An ActivityFacility is always created for the smallest unit possible, as illustrated by the following examples:

  1. A Building with more than one Poi → create an ActivityFacility for each Poi
  2. A Building with more than one HouseNumber → create an ActivityFacility for each HouseNumber
  3. A Poi with more than one Building → create an ActivityFacility for each Building

Buildings without Pois

An ActivityFacility is created for buildings without associated POIs if

a) the value of the “building”-tag is specified in the Configuration or
b) the value of the “building”-tag is “yes”, the associated land-use area is tagged with a type specified in the Configuration and the building footprint is larger than 50 m² for residential buildings or larger than 100 m² for non-residential buildings.
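
The two conditions can be expressed as a simple predicate. The sketch below is illustrative only: the configured tag values are made-up examples, the function name is hypothetical, and it assumes that a building tagged “yes” counts as residential if its land-use area is of type `residential`.

```python
# Hypothetical excerpts of the Configuration.
CONFIGURED_BUILDING_VALUES = {"apartments", "retail", "office"}
CONFIGURED_LANDUSE_TYPES = {"residential", "commercial", "industrial", "retail"}


def building_gets_facility(building_value, landuse_type, footprint_m2):
    """Decide whether a building without POIs yields an ActivityFacility."""
    # Condition a): the building value itself is configured.
    if building_value in CONFIGURED_BUILDING_VALUES:
        return True
    # Condition b): generic building in a configured land-use area
    # with a sufficiently large footprint.
    if building_value == "yes" and landuse_type in CONFIGURED_LANDUSE_TYPES:
        min_footprint = 50.0 if landuse_type == "residential" else 100.0
        return footprint_m2 > min_footprint
    return False
```

For example, a building tagged “yes” with a footprint of 80 m² would yield a facility in a residential area but not in a commercial area.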

Landuse with neither Pois nor Buildings

There are cases in OSM where a land-use area is defined without any buildings or POIs within that land-use area. If such a land-use area is tagged with one of the land-use tags specified in the Configuration, ActivityFacilities are created for the Landuse object. For each activity type specified in the Configuration for the land-use type, the number of ActivityFacilities is determined by drawing a random number from an interval of permissible building densities and multiplying this number by the area of the Landuse polygon. The permissible building density interval is bounded by the 25th and 75th percentiles of the building density distribution for the corresponding land-use type. The precise coordinate for each ActivityFacility is chosen randomly within the Landuse polygon.
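
A minimal sketch of this procedure is given below. It assumes a uniform draw within the density interval, rounding to the nearest integer, polygon coordinates in metres and rejection sampling for the random coordinate; none of these details are specified above, and all function names are hypothetical. The density bounds of 827 and 1923 buildings per km² are the Berlin values for residential land use from Table 6.

```python
import random


def polygon_area(polygon):
    """Area of a simple polygon given as a list of (x, y) vertices (shoelace formula)."""
    n = len(polygon)
    twice_area = sum(
        polygon[i][0] * polygon[(i + 1) % n][1]
        - polygon[(i + 1) % n][0] * polygon[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def contains(polygon, x, y):
    """Ray-casting point-in-polygon test."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def number_of_facilities(polygon, p25, p75, rng):
    """Random density in [p25, p75] (per km²) times the polygon area (m² to km²)."""
    density = rng.uniform(p25, p75)
    return round(density * polygon_area(polygon) / 1_000_000)


def random_point_in(polygon, rng):
    """Draw points in the bounding box until one falls inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    while True:
        x, y = rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys))
        if contains(polygon, x, y):
            return x, y


rng = random.Random(1)
landuse = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]  # triangle of 0.5 km²
count = number_of_facilities(landuse, 827, 1923, rng)
point = random_point_in(landuse, rng)
```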

ActivityFacility attributes

The following attributes are assigned to the ActivityFacilities:

  • area (mandatory)
  • ActivityOptions (mandatory)
  • name (optional)
  • street (optional)
  • access (optional)
  • relevantTag (optional)
  • relevantTagValue (optional)

The optional attributes are only assigned if a corresponding tag is set for the POI or building in OSM. Missing values are not imputed.

Area

The calculation of the area attribute depends on the source of the ActivityFacility as described by the following rules:

  • For a Poi derived from a node:
    • If the Poi lies within a building: divide the area of the building by the number of Pois within the building.
    • If the Poi does not lie within a building: randomly draw an area from the distribution of building footprints by main activity type (see Section Study region statistics).
  • For a Poi derived from a way:
    • Use the area of the Poi.
    • Exception: For malls, the area of the mall is divided by the number of all Pois within the mall, regardless of whether they are ways or nodes. This maintains consistency by distributing the non-shop areas of a mall among the shops.
  • For a Building without Pois:
    • With associated HouseNumbers: divide the building area by the number of HouseNumbers within the building.
    • Without associated HouseNumbers: use the area of the building.
  • For an OsmLanduse without Pois or Buildings:
    • Randomly draw an area from the distribution of building footprints by activity type (see Section Study region statistics).
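
Some of these rules can be sketched in code. The sketch is illustrative only: the function names are hypothetical, and since the drawing procedure is not specified above, it assumes that an area is drawn by linear interpolation between known quantiles of the footprint distribution. The quantile values are borrowed from the Shop row of Table 7 purely as an example.

```python
import bisect
import random

# (probability, footprint in m²), example values from the Shop row of Table 7.
SHOP_QUANTILES = [(0.05, 30), (0.10, 55), (0.25, 158), (0.50, 588),
                  (0.75, 1443), (0.90, 3317), (0.95, 6736)]


def draw_footprint(quantiles, rng):
    """Draw an area by interpolating linearly between known quantiles (assumption)."""
    probs = [p for p, _ in quantiles]
    u = rng.uniform(probs[0], probs[-1])
    i = min(bisect.bisect_right(probs, u), len(probs) - 1)
    (p0, v0), (p1, v1) = quantiles[i - 1], quantiles[i]
    return v0 + (u - p0) / (p1 - p0) * (v1 - v0)


def poi_node_area(building_area, pois_in_building, quantiles, rng):
    """Area of a Poi derived from a node."""
    if building_area is not None:
        # Poi lies within a building: share the building area among its Pois.
        return building_area / pois_in_building
    # Poi outside any building: draw from the footprint distribution.
    return draw_footprint(quantiles, rng)


def building_area_share(building_area, house_numbers):
    """Area of a facility created for a Building without Pois."""
    return building_area / house_numbers if house_numbers > 0 else building_area


rng = random.Random(7)
shared = poi_node_area(200.0, 4, SHOP_QUANTILES, rng)
drawn = poi_node_area(None, 0, SHOP_QUANTILES, rng)
```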

ActivityOption

Each ActivityFacility is assigned at least one ActivityOption. Each ActivityOption has a (mandatory) type, a weight/capacity and may also have opening hours. The respective activity types are defined in the Configuration for each POI, building and land-use area. The weights and opening hours are determined as described below.

Weights

The weight of an ActivityFacility / ActivityOption represents its attractiveness relative to other ActivityFacilities / ActivityOptions of the same type. It is currently stored in the capacity attribute of the ActivityOption. Different approaches are used to calculate the weight for home activities and for all other activity types.

Weight for home ActivityOptions

The weight measure for home ActivityOptions approximates the number of households living in the ActivityFacility. Thus, the first step is to differentiate between single- and multifamily homes. A home facility is considered a single-family home if it is a cabin, houseboat, farm, caravan or bungalow, or if the building is tagged as yes, detached, house, residential, semidetached_house or terrace and its area is smaller than the threshold for building footprints of single-family homes (the 90% quantile described in Section Study region statistics). If a home facility is considered a single-family home, the weight for the ActivityOption is set to 1.

For multifamily homes, the weight is calculated with the following formula:

$weightMFH = numberOfFloorLevels * averageNumberOfFlatsPerFloor$

Since there is no empirical data currently available regarding the average number of flats per floor, the value of this parameter is set based on expert assessment to 2.5.
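
The home weight calculation can be summarised in a short sketch. The function and constant names are hypothetical, and the footprint threshold of 200 m² used in the examples is a made-up value; in the application it is derived from the study region statistics.

```python
# Building values that always count as single-family homes.
ALWAYS_SINGLE_FAMILY = {"cabin", "houseboat", "farm", "caravan", "bungalow"}
# Building values that count as single-family homes only below the threshold.
SINGLE_FAMILY_IF_SMALL = {"yes", "detached", "house", "residential",
                          "semidetached_house", "terrace"}
AVERAGE_FLATS_PER_FLOOR = 2.5  # expert assessment, as stated in the text


def home_weight(building_value, area_m2, floor_levels, sfh_threshold_m2):
    """Approximate the number of households in a home facility."""
    value = building_value.lower()
    if value in ALWAYS_SINGLE_FAMILY or (
        value in SINGLE_FAMILY_IF_SMALL and area_m2 < sfh_threshold_m2
    ):
        return 1.0
    # Multifamily home: levels times average number of flats per floor.
    return floor_levels * AVERAGE_FLATS_PER_FLOOR


small_house = home_weight("house", 120.0, 2, 200.0)       # single-family
large_house = home_weight("house", 350.0, 3, 200.0)       # 3 * 2.5
apartments = home_weight("apartments", 400.0, 5, 200.0)   # 5 * 2.5
```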

Weight for other ActivityOptions

The weight for all other ActivityOption types is calculated based on the floor area of the ActivityFacility and a weight factor per floor area. The weight factor per floor area depends on the relevant OSM tag and the activity type of the ActivityOption. It can be interpreted as a measure of usage intensity. Since there is no empirical data for the usage intensity of all possible facilities, many assumptions have to be made. To simplify this, three intensity levels (LOW, MEDIUM, HIGH) are used, and for each activity type the associated OSM tags are assigned an intensity level in the Configuration. Examples for this assignment are shown in Table 2 and in Figure 6:

| Activity type | LOW | MEDIUM | HIGH |
|---|---|---|---|
| Work | Industrial, Religion | Shop, School | Office, Government |
| Education | University | School | Kindergarten |
| Shopping | Garden centre | Clothes, Supermarket | Kiosk, Bakery |
| Leisure | Allotments, Golf course | Museum, Cinema | Cafe, Fast food |
| Errand | Animal shelter, Car repair | Physiotherapist, Hairdresser | Doctor's practice, Post office |

Table 2: Examples for Intensity Level Classifications of Different Relevant OSM Tags

Facility_Config_IntensityLevels

Figure 6: Example for the Definition of Intensity Levels per Activity Type in the Config

The assumptions for the weight factor per floor area for each activity type and intensity level are shown in Table 3. Note that only the relative differences between ActivityOptions of the same activity type are relevant; thus the factor for the medium intensity level is always set to 1.0.

| Activity type | LOW | MEDIUM | HIGH |
|---|---|---|---|
| Work | 0.30 | 1.00 | 1.50 |
| Education | 0.50 | 1.00 | 1.25 |
| Shopping | 0.15 | 1.00 | 2.00 |
| Leisure | 0.10 | 1.00 | 3.00 |
| Errand | 0.20 | 1.00 | 2.50 |

Table 3: Assumptions for Weight Factor per Floor Area [$1/m^{2}$] by Activity Type and Intensity Level
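
Using the factors from Table 3, the weight calculation for non-home ActivityOptions can be sketched as follows. The sketch assumes that the weight is the simple product of floor area and weight factor, which the unit [$1/m^{2}$] suggests but the text does not state explicitly; the names `WEIGHT_FACTORS` and `option_weight` are hypothetical.

```python
# Weight factor per floor area [1/m²] by activity type and intensity level (Table 3).
WEIGHT_FACTORS = {
    "work":      {"LOW": 0.30, "MEDIUM": 1.00, "HIGH": 1.50},
    "education": {"LOW": 0.50, "MEDIUM": 1.00, "HIGH": 1.25},
    "shopping":  {"LOW": 0.15, "MEDIUM": 1.00, "HIGH": 2.00},
    "leisure":   {"LOW": 0.10, "MEDIUM": 1.00, "HIGH": 3.00},
    "errand":    {"LOW": 0.20, "MEDIUM": 1.00, "HIGH": 2.50},
}


def option_weight(activity_type, intensity, floor_area_m2):
    """Weight of a non-home ActivityOption: floor area times weight factor (assumed)."""
    return floor_area_m2 * WEIGHT_FACTORS[activity_type][intensity]


cafe_weight = option_weight("leisure", "HIGH", 100.0)          # cafe of 100 m²
garden_centre_weight = option_weight("shopping", "LOW", 200.0)  # garden centre of 200 m²
```

Under this assumption, a cafe of 100 m² would be ten times as attractive for leisure as a museum of 30 m² (MEDIUM), which illustrates that only comparisons within one activity type are meaningful.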

Opening hours

To process the opening hours, the Opening Hours Parser by Simon Poole is used. The results are then processed to fit MATSim opening hour requirements:

  1. In OSM, it is possible to store different sets of opening hours for different weekdays, public holidays, school holidays etc.
    For this application, only one set of opening hours is stored. To be representative of an average weekday, the opening hours for Tuesday are chosen.
  2. The original parser (as of April 2025) does not handle inconsistencies in the start and end times, such as either value being smaller than zero.
    In addition, opening hours that extend beyond midnight need to be recoded to a time frame of more than 24 hours, e.g. 20:00-02:00 is recoded to 20:00-26:00. Both issues are corrected in Creario afterwards.
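
The recoding of opening hours that extend beyond midnight can be sketched as follows. This is not the code used in Creario: the function names are hypothetical, times are assumed to be given as `HH:MM` strings, and inconsistent entries with negative values are simply discarded here, which is an assumption.

```python
def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def to_hhmm(total_minutes):
    """Convert minutes since midnight to 'HH:MM'; hours may exceed 24."""
    return f"{total_minutes // 60:02d}:{total_minutes % 60:02d}"


def normalise_opening(start, end):
    """Recode an opening interval so that the end is never before the start."""
    start_min, end_min = to_minutes(start), to_minutes(end)
    if start_min < 0 or end_min < 0:
        return None  # inconsistent entry, discarded in this sketch
    if end_min < start_min:
        end_min += 24 * 60  # interval extends beyond midnight
    return to_hhmm(start_min), to_hhmm(end_min)


late_bar = normalise_opening("20:00", "02:00")
office = normalise_opening("08:00", "18:00")
```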

Results Berlin example

The following figures and tables show the results for Berlin.

Facilities_Results_Berlin

Figure 7: ActivityFacilities Berlin by Activity Type

| Semantic object type | Number of Objects | Number of Facilities |
|---|---|---|
| Poi | 80’700 | 92’667 |
| Building | 527’018 | 409’232 |
| Landuse | 35’824 | 1’626 |

Table 4: Statistics of Generated ActivityFacilities per Semantic Object Type

| Activity type | Number of Facilities |
|---|---|
| Home | 392’465 |
| Work | 78’500 |
| Education | 7’391 |
| Shopping | 14’029 |
| Leisure | 50’411 |
| Errand | 14’944 |
| Total | 503’344 |

Table 5: Number of ActivityFacilities per Activity Type

| Land-use type | Average | 5% | 10% | 25% | Median | 75% | 90% | 95% |
|---|---|---|---|---|---|---|---|---|
| Commercial | 857 | 119 | 159 | 275 | 546 | 1038 | 1922 | 2746 |
| Education | 311 | 118 | 134 | 163 | 207 | 497 | 786 | 1113 |
| Industrial | 744 | 83 | 117 | 214 | 406 | 815 | 1513 | 2182 |
| Residential | 1479 | 287 | 438 | 827 | 1339 | 1923 | 2553 | 2996 |
| Retail | 758 | 104 | 146 | 232 | 478 | 967 | 1861 | 2361 |

Table 6: Building Densities [buildings / km²] by Land-Use Type

Facilities_Results_BuildingLevels_Berlin

Figure 8: Number of Floor Levels by Building Type

Average floor level heights:
apartment buildings: 3.848 m
other residential buildings: 3.655 m

| Building tag | Average | 5% | 10% | 25% | Median | 75% | 90% | 95% |
|---|---|---|---|---|---|---|---|---|
| Shop | 1008 | 30 | 55 | 158 | 588 | 1443 | 3317 | 6736 |
| Education | 3440 | 138 | 236 | 452 | 980 | 3250 | 9565 | 15746 |
| Office | 1191 | 89 | 137 | 270 | 572 | 1150 | 2253 | 3574 |
| Leisure | 5433 | 18 | 42 | 138 | 513 | 1791 | 6927 | 13359 |
| Commercial | 818 | 38 | 63 | 150 | 359 | 857 | 1749 | 2811 |
| Industrial | 1370 | 37 | 64 | 156 | 450 | 1249 | 3358 | 5918 |
| Religious | 550 | 104 | 147 | 254 | 456 | 759 | 1077 | 1288 |

Table 7: Floor Area [m²] of Buildings by Building Tag

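| Activity type | Average | Minimum | Maximum |
|---|---|---|---|
| Home | 159 | 0.446 | 15’397 |
| Work | 648 | 0.001 | 1’589’907 |
| Education | 733 | 0.002 | 38’623 |
| Shopping | 544 | 0.064 | 36’719 |
| Leisure | 1’189 | 0.205 | 1’589’907 |
| Errand | 488 | 0.001 | 23’494 |
| Other | 322 | 0.001 | 1’589’907 |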

Table 8: Floor Area [m²] of ActivityFacilities by Activity Type