City generation from real datasets

Jasper Balster – 500862136

Procedural city generation, to generate a city which in theory (looks like it) could exist is nothing new and has been developed multiple times over the years. Just have a look at this article1 where techniques like perlin noise, L-Systems and tiling systems are given as examples for this task. However, as stated above, these techniques do result in road networks and buildings which in reality do not exist.

What if, instead of generating something completely new, one could take data from existing cities around the world and use this to combine various elements from these cities with each other to create a new city. This, of course, would lead to some interesting scenarios as for example the roads of Tokyo would never line up with the roads of New York. Some sacrifices do have to be made here. Another crucial element would be the aspect of the actual data. How could an entire city be loaded into a readable format which also incorporates speed when processing this data? These and other questions lay the foundation for this research article.

Table of contents

  1. Introduction and goals
  2. Pre-processing the data
    1. The data
    2. Python to the rescue
      1. Coordinate conversion
      2. Nodes and edge concatenation
      3. File exportation
  3. Over to Unity (Editor)!
    1. Custom editor window
  4. Loading, generating and stitching the roads
    1. Data loading
    2. Grid partitioning
    3. Detection of “end links”
    4. Connecting the “end links”
  5. Road mesh generation
    1. Edge mesh generation
    2. Intersection generation
  6. Building placement
    1. Plot generation
      1. OBB plots for loops
        1. Adaptation of the DCEL data structure
        2. Utilizing the OBB
      2. Single roads
    2. Filling plots
  7. Environment generation
  8. Conclusion
    1. Answering research questions
    2. The main goal
  9. Future prospects
  10. Sources

1. Introduction and goals

Originally, the idea for this research project was somewhat different to the eventual taken path. In the original concept, a neural network would be utilized to generate city layouts based on provided pre-existing visual / aerial datasets. However, during the initial desk research phase a data source was discovered which contains vector based data on over 80 cities around the world2 3. This immediately spiked my interest and I soon realized the usage of these datasets might bring the eventual end goal a little more within scope of a four-week project.

It would be unfair to describe the end goal as the singular goal of this research as the eventual result more or less resembles a sort of pipeline of stages. Nevertheless, an all-encompassing end goal has been drafted and could be described as the following:

“To procedurally generate a city road layout which integrates various elements directly taken from a selection of pre-existing datasets based on actual real-world cities”

As stated above, this goal can be split up in a total of six different sub-goals (questions) with each their own respective part within a grander pipeline. These are stated down below. Their numeric values correspond to the same values present within the pipeline below these sub-goals and will also be used throughout the article whenever applicable.

  1. “In what way can the acquired datasets be converted into a single readable format which allows for quick indexing and retrieval of a selection of nodes / edges?”
  2. “How can the converted data easily and quickly be loaded into Unity while customization of the eventual end result is still widely possible after this is completed?”
  3. “In what way can the data be easily partitioned / placed on a predetermined sized grid?”
  4. “How can selections of nodes and edges belonging to different cities which potentially do not share the same internal structure be connected to each other while retaining a form of realism and plausibility?”
  5. “In which manner could road meshes be generated from the resulting nodes and edges which insure a smooth transition between different kinds of roads which inherently retain different characteristics?”
  6. “How can buildings be placed in available spaces (lots) in between the generated roads?”
Constructed pipeline overview of the entire project

2. Pre-processing the data

Before anything can be done with the datasets in terms of nodes and edges, it is first quite important to understand how the data is structured and in which way it can best be interpreted to eventually result in an actual readable format.

2.1 The data

Inherently the acquired data consists of a structure called a “shapefile”. Despite the name, many shapefiles come with a variety of other supporting files which also contain data necessary for reading the entirety of the file4. This was in fact the case for the retrieved datasets as can be seen in the image.

Example of the New York dataset

Now there were a few options available when it comes to loading in the data from these shapefiles. The most straightforward ones being using one of the many available c# libraries which can handle this type of file format. However, this option has a few drawbacks which in fact resulted in another approach. For one, all 17 input files need to be present at all times to load a particular city in. This is not very convenient. Secondly, the data needs to constantly be loaded from this proprietary type of data format which means it is impossible to format the way the data is saved and / or apply any conversions (this is something quite important performance-wise as will be discussed later on). Lastly, it would be impossible to separate data conversion from the actual usage. Even though this last aspect is more of a personal preference, it does come with its advantages. The solution to this was to completely separate the data conversion and parsing process from the actual usage. This way the data can be pre-optimized for when it will be loaded into the actual tool.

2.2 Python to the rescue

When thinking of a quick to run, lightweight, self-contained and portable language which will surely have a library available for (almost) anything you can desire Python almost immediately comes to mind. And sure enough, there is a widely used package called “geopandas”5 available which is an extension for the underlying “pandas” data structure providing package.

Before the script will be discussed, it is first important to note what this script needs to achieve in order to be successful. As a baseline it needs to do these three things:

  1. Convert the internal coordinates into a format relative to the most central lying node (this way all datasets will have equal coordinate systems)
  2. Concatenate nodes and edges into a format in which each node parents all edges originating from that node.
  3. Output the data onto a single file format easily readable by c# or available libraries.

Each of these three functions will be thoroughly described in the following sub-chapters.

2.2.1 Coordinate conversion

One of the first things that was noticeable about the datasets are the coordinates used between different cities. These differ from city to city and thus for any combinations to be made, first need to be normalized. For this the approach taken was to take the center-most point and convert all coordinates relative to this point. An example can be seen to the right.

Within the dataset, coordinates are presented in a plain-text format which means they need to be extracted from this text. For this a custom regular expression was written which quickly retrieves both coordinates from such a string.

def parsepoint_nodes(point):
    reExp ="^POINT\s\((\d+\.\d+)\s(\d+\.\d+)\)$", point)
    return (float(, float(

The conversion was done by traversing twice over all nodes within the dataset. The first iteration determines the center node while the second iteration converts all remaining nodes to the new coordinate system. During the second iteration, all unnecessary data was also removed.

    for row in csvreader:
        if (index % 3 == 0):
            print(str(round(index / row_count * 100, 2)) + "% complete            \r", end="")

        geo_x, geo_y = parsepoint_nodes(row["geometry"])

        # checking bounds
        adjgeo_x = geo_x - middlenode[0]
        adjgeo_y = geo_y - middlenode[1]

        # removing some elements that are not needed
        del row["geometry"]
        del row["FID_Select"]
        del row["osm_id"]
        del row["ref"]
        del row["Class"]
        del row["ORIG_FID"]

        # export list saving
        row["ID"] = row.pop("")
        row["geometry_x"] = adjgeo_x
        row["geometry_y"] = adjgeo_y
        exportlist[geo_x + geo_y] = row  # using sum of geo_x and geo_y as arbitrary key for links to quickly find starting point later

In order to later on also quickly link de edges to their corresponding node parents the data is added to a dictionary with as key the sum of the new x and y coordinates. As edges have the exact same coordinates for their starting points as nodes this allows for an easy aggregation.

2.2.2 Nodes and edge concatenation

Now that the nodes have been loaded and converted, it is time to add the edges to their corresponding parent nodes. This is done by iterating over each of the edges, converting their starting point coordinates and checking if the dictionary contains a key made up from the sum of the new x and y coordinates. Even though all edges should easily find their corresponding parent node, it is important to note that key errors do in fact occur in some of the datasets. This proves that the data is not infallible.

    for row in csvreader:
        if (index % 3 == 0):
            print(str(round(index / row_count * 100, 2)) + "% complete            \r", end="")

        # parsing the points found in the geometry table
        firstpoint, parsedpoints = parsepoints_links(row["geometry"], middlenode)

        # looping through the parsed points and adding them to the exportlist
        key = firstpoint[0] + firstpoint[1]
        for i in range(10):
                if(exportlist[key]["linksgeo" + str(i+1)] == ""):
                    exportlist[key]["linksgeo" + str(i+1)] = parsedpoints
            except KeyError:
                keyerrors += 1
                print("Keyerror encounter number " + str(keyerrors))

2.2.3 File exportation

Finally, the data needs to be exported into a format which can easily be loaded into Unity later on. For this, both JSON and CSV were considered as both are common formats found in every popular language. As you may have noticed some of the aforementioned for loops contain the iterable type csvreader which is, as the name suggests, used for reading csv files. This is because originally CSV was chosen to export the data to. However, when it was determined that nodes would represent underlying edges within the dataset JSON was favored over CSV due to its natural tree like structure.

The main issue you are facing when determining the export format is how to make it as easy and as fast as possible to read for when it needs to be imported. For this, a partitioning solution using the X and Y coordinates as separate layers of arrays was thought out. The data will be formatted into a dictionary where a rounded X value is the first layer key and a rounded Y value is the second layer key.

It works in the following way:

  1. First, the X value of a node is rounded down to a hundred. If the key does not yet exist, create it.
  2. Then, the Y value of a node is rounded down to a hundred. If the key does not yet exist, create it.
  3. The node is added to an array within this Y key.
Example of the dictionary contents at the given X and Y keys

After the data has been formatted it will be exported into JSON format using Python’s built in json library. The result is a singular JSON file of a couple MB depending on the size of the parsed city (e.g. new york as one of the largest cities is ~300MB).

Example of two exported JSON files

3. Over to Unity (editor)!

The next step is to take the resulting file and load it into Unity for usage. As can be seen in the discussed pipeline, the Unity segments are divided into a total of five stages which each lead to a partial result. To achieve this a custom editor window was created for the user to customize the result and walk through these stages.

3.1 Custom editor window

When loaded into Unity with the tool opened the following screen will be visible. A short description per option is also provided. A more thorough explanation of these options will be provided for any option later discussed.

Overview of the available settings in the editor window

As can be seen, each option has been assigned a designation. These designations will be used in this article later on to reference certain options and configurations as these of course effect the nature of some of the operations.

4. Loading, generating and stitching the roads

This chapter covers both stage 1 and 2 of the constructed pipeline in that it covers both loading of the data and generation of the roads. However, internally this process can be subdivided into a total of four different stages which will all be discussed over the following sub-chapters:

  1. Loading of the data
  2. Partitioning of the nodes on a grid
  3. Detecting so-called “end links”
  4. Connecting these “end links”

4.1 Data loading

Now that the data is formatted in such a way it can be easily loaded it was crucial to determine the way in which it will be loaded. The first option which comes to mind is to load the JSON file using Unity’s built in Json library. This, however was not an option as it only allows for deserialization into objects which is not suitable for the generated dataset as it needs a dictionary6.

One of the third-party options needed to be chosen. Between a few options available the choice was made to utilize Newtonsoft’s library7. This library allows for both serialization and deserialization of JSON and also has support for partial deserialization which was one of the deciding factors. However, after testing performance of partially loading a JSON file it turned out this was not a viable option when compared to just plainly loading the entire file into memory. For comparison, load times for New York and Tokyo combined lie around the 30 seconds mark. Just loading New York using partial loading techniques already takes up more than a minute.

Other than partial loading of the input files it was also tested if binary saving / loading of the resulting dictionary would be any quicker. This also turned out to be slower by a lot taking around 2 minutes to just load in New York. It was then decided to just stick to parsing the JSON file every time as it seemed to be the quickest option.

Deserialization is now as easy as executing a single line of code as can be seen in the following method.

public static Dictionary<intDictionary<intList<Dictionary<stringstring>>>> LoadNodesJson(string jsonFilePath)
    string json = File.ReadAllText(jsonFilePath);
    //Dictionary in the following format: [xcoord][ycoord][index][property] where x and y are rounded to the hundred
    return JsonConvert.DeserializeObject<Dictionary<intDictionary<intList<Dictionary<stringstring>>>>>(json);

4.2 Grid partitioning

Before any nodes can be placed within a partition on a grid, the grid itself needs to be generated. This can easily be done by dividing the specified dimensions for the city by the specified divider. This results in collection of the following class describing the partitions that were just created.

public class RectPartition
    public RectInt rectProperty;
    public List<Node> nodes;
    //This list works the following way: 
    //first int in tuple is index for the node. Second int in tuple is index for the linksgeo list contained in the node. Third int in tuple is for the linksgeo part inside the list
    public List<(intintint)> endLinksTop, endLinksLeft, endLinksRight, endLinksBottom;

Now it’s time to load the nodes onto each section of the grid. For this, there are two ways to proceed according to what the user selected for option (P.2). When the random placement option is selected, a new selection of nodes from a random city (which still has partitions left to allocate) will be loaded onto the next partition while there are still partitions left to fill. This differs from what happens when group placement is selected as this algorithm will first complete an entire selection of a city before moving on to the next one. This difference is visualized down below using a selection of 2 cities on a 5 x 5 grid where city 2 has one more partition than city 1.

Depicted difference between the two types of generation

Now, as described before the nodes are sorted by X and Y coordinates rounded down to the hundreds. This means the only thing left to do is to get a random range and loop through all segments in the data structure which may contain nodes which lie inside this range. For this, the method which can be seen below is constructed.

public static List<NodeGetNodesWithinRange((floatfloatxRange, (floatfloatyRangeDictionary<intDictionary<intList<Dictionary<stringstring>>>> loadedNodesDictionary<stringboolroadInclusionsbool resolveLinks)
    List<NodereturnList = new List<Node>();
    (intintroundedXBorders = ((int)Mathf.Floor((xRange.Item1 / 100)) * 100, (int)Mathf.Floor((xRange.Item2 / 100)) * 100);
    (intintroundedYBorders = ((int)Mathf.Floor((yRange.Item1 / 100)) * 100, (int)Mathf.Floor((yRange.Item2 / 100)) * 100);
    for (int xIndex = roundedXBorders.Item1; xIndex <= roundedXBorders.Item2; xIndex += 100)
        for (int yIndex = roundedYBorders.Item1; yIndex <= roundedYBorders.Item2; yIndex += 100)
                foreach (Dictionary<stringstringnodeElement in loadedNodes[xIndex][yIndex])
                    if (float.Parse(nodeElement["geometry_x"]) > xRange.Item1 && float.Parse(nodeElement["geometry_x"]) < xRange.Item2 &&
                        float.Parse(nodeElement["geometry_y"]) > yRange.Item1 && float.Parse(nodeElement["geometry_y"]) < yRange.Item2 &&
                        returnList.Add(new Node()
                            ID = int.Parse(nodeElement["ID"]),
                            name = nodeElement["name"],
                            type = nodeElement["type"],
                            oneway = int.Parse(nodeElement["oneway"]),
                            bridge = int.Parse(nodeElement["bridge"]),
                            tunnel = int.Parse(nodeElement["tunnel"]),
                            maxspeed = int.Parse(nodeElement["maxspeed"]),
                            geometry_x = float.Parse(nodeElement["geometry_x"]),
                            geometry_y = float.Parse(nodeElement["geometry_y"]),
                            linksGeos = DataFormatter.GetAllLinksFormattedJson(nodeElement)
            catch (KeyNotFoundException e)
    //Resolving linksgeos connected nodes by using a dictionary
    return returnList;

The following is a result of the generation depicted by debug lines. For this result the datasets of Delhi and Mumbai are used with a 13 / 12 division. On the left is random placement against group placement on the right.

Random placement result
Group placement result

Of course the images above are merely a result of the settings specified by the user. The following settings are primarily used to customize this result:

  • S1.1: City selection
  • S1.2: Range for node selection
  • S2.2: Vertical and horizontal partitions
  • S2.3: Min number of nodes per partition
  • S2.8: Roads to include in the generation

4.3 Detection of “end links”

The next step would be to detect so called “end links”. These are links (segments of an edge underneath a node) which cross a partition boundary into a partition which belongs to a different city entirely (which means this is avoided when a group placement is being executed). When this is the case two things need to happen:

  1. This edge needs to be reduced to the point where it no longer crosses this boundary.
  2. The last link or node before the boundary will be saved per that particular partition in a list.

This is what happens within the method which is constructed to detect these end links. The following code will be executed on each link contained within a segment until no more links are left or a boundary is crossed. The last segment of the code removes any links that are followed by a node which passes the border.

if((endLinksCollectSide.Contains(EndLinksCollectSide.All) || endLinksCollectSide.Contains(EndLinksCollectSide.Top)) &&
    nodes[i].linksGeos[j].links[k].> yRange.Item2) //TOP
    returnList[0].Add((ijk - 1));
    clearNext = true;
} else if((endLinksCollectSide.Contains(EndLinksCollectSide.All) || endLinksCollectSide.Contains(EndLinksCollectSide.Left)) && 
    nodes[i].linksGeos[j].links[k].< xRange.Item1) //LEFT
    returnList[1].Add((ijk - 1));
    clearNext = true;
} else if ((endLinksCollectSide.Contains(EndLinksCollectSide.All) || endLinksCollectSide.Contains(EndLinksCollectSide.Right)) && 
    nodes[i].linksGeos[j].links[k].> xRange.Item2) //RIGHT
    returnList[2].Add((ijk - 1));
    clearNext = true;
} else if ((endLinksCollectSide.Contains(EndLinksCollectSide.All) || endLinksCollectSide.Contains(EndLinksCollectSide.Bottom)) && 
    nodes[i].linksGeos[j].links[k].< yRange.Item1) //BOTTOM
    returnList[3].Add((ijk - 1));
    clearNext = true;
if (clearNext)
    nodes[i].linksGeos[j].links.RemoveAll(segment => nodes[i].linksGeos[j].links.IndexOf(segment>= k);
    hasEndLink = true;
Without edge link detection and removal
With edge link detection and removal

As can clearly be seen in the two examples above the segments crossing into other partitions are reduced to just being in front of the edge. This segment is now ready to be connected to neighbouring segments.

The following settings are primarily used to customize this result:

  • S2.6: The max distance of a node to an edge to be considered an end link

4.4 Connecting the “end links”

The final step within the road generation process is to connect the previously found end links. This is done by iterating over each end link of two sides facing each other and checking if a connection can be made which adheres to a set of preset conditions (see the settings which can be used to customize this at the end of this subchapter).

//Finding closest node
float closestDistance = float.MaxValue;
((intintint), Vector2foundSpot = ((int.MaxValue, int.MaxValue, int.MaxValue),;
Vector2 fEndSpot = GetEndLinkData(firstRectfirstEndSpot);
foreach ((intintintsecondEndSpot in secondSide)
    //Checking if connect limit already reached
    try { if (usedSecondEndSpots[secondEndSpot>= roadGenerationParameters.endConnectLimit) continue; } catch(KeyNotFoundException) { }
    //Checking if right road type
    if (!IsSuitableRoadType(secondRect.nodes[secondEndSpot.Item1].type, roadGenerationParameters.roadConnectionLimits[firstRect.nodes[firstEndSpot.Item1].type])) continue;
    Vector2 sEndSpot = GetEndLinkData(secondRectsecondEndSpot);
    float dist = Vector2.Distance(fEndSpotsEndSpot);
    if (dist < roadGenerationParameters.maxConnectDist && dist < closestDistance)
        foundSpot = (secondEndSpotsEndSpot);
        closestDistance = dist;

The code above first runs through a couple of checks like if this particular segment has already reached its maximum links or if its road type is flagged as connectable for this particular type. If these tests return positive, the distance is calculated between the two points. If this value is lower than the previously found distance it is recorded as the new shortest distance.

Eventually the two nodes with the shortest distance will be connected to one another by simply adding a link at the end of the first segment to the end of the second segment. This process can be seen below.

if (closestDistance != float.MaxValue)
    if (firstRect.nodes[firstEndSpot.Item1].linksGeos.Count == 0firstRect.nodes[firstEndSpot.Item1].linksGeos.Add(new LinksGeo() { links = new List<Vector2>() { firstRect.nodes[firstEndSpot.Item1].geometry_xy }, ParentNode = firstRect.nodes[firstEndSpot.Item1] });
    firstRect.nodes[firstEndSpot.Item1].linksGeos[firstEndSpot.Item2].ConnectedTo = secondRect.nodes[foundSpot.Item1.Item1];
    try { usedFirstEndSpots[firstEndSpot+= 1; } catch (KeyNotFoundException) { usedFirstEndSpots[firstEndSpot= 1; }
    try { usedSecondEndSpots[foundSpot.Item1] += 1; } catch (KeyNotFoundException) { usedSecondEndSpots[foundSpot.Item1] = 1; }
    Debug.DrawLine(new Vector3(fEndSpot.x, 0fEndSpot.y), new Vector3(foundSpot.Item2.x, 10foundSpot.Item2.y),, 10);

Eventually this will result in the following types of connections made between roads ending at the boundaries of their respective partitions.

The following settings are primarily used to customize this result:

  • S2.4: Max number of connections which can be made to a singular edge link
  • S2.5: Max distance for a connection to be made between edge links
  • S2.9: Selection of which roads are able to connect to which

5. Road mesh generation

Now that a road network is generated, the only thing remaining for the roads is to convert this into a mesh. Generally speaking this comes down to two different types of mesh generation:

  • Generation of the edges in between the nodes. These can exist of a variable number of links.
  • Generation of the intersections connecting the edges together. An additional data conversion was required for this.

5.1 Edge mesh generation

The most straightforward one of the at least two meshes that need to be generated for each separate node is definitely for the segments. For this a technique described in detail by Sebastian L.8 was used for generation of the vertices, triangles and uvs.

The basic idea comes down to iterating over each of the links inside an edge until the end is reached. For each of these links half the amount of the desired width is extruded in a perpendicular direction of the average forward vector between two successive links. This is done in both directions to achieve the full width of the road. These points will later on represent the vertices of the mesh. This process is also visualized within the images to the right.

The following for loop loops through all of the provided points and creates vertices on points which lie perpendicular to the actual part of the link

for(int i = 0i < roadParts.Count; i++vertIndex+=2triangleIndex += 6)
    //Calculating the forward vector for this part of the roadparts
    Vector2 forward =;
    if(i < roadParts.Count - 1)
        forward += roadParts[i + 1- roadParts[i];
    if(i > 0)
        forward += roadParts[i- roadParts[i - 1];
        countedLength += Vector2.Distance(roadParts[i], roadParts[i - 1]);
    Vector2 left = new Vector2(-forward.y, forward.x);
    //Setting the vertices
    Vector2 firstV = roadParts[i+ (left * (width / 2));
    Vector2 secondV = roadParts[i- (left * (width / 2));
    verts[vertIndex= new Vector3(firstV.x, 0firstV.y);
    verts[vertIndex + 1= new Vector3(secondV.x, 0secondV.y);
    //Setting the uvs
    //float progression = i / (float)(roadParts.Count - 1);
    float progression = countedLength / totalLength;
    uvs[vertIndex= new Vector2(0progression);
    uvs[vertIndex + 1= new Vector2(1progression);
    //Setting the triangles
    if(i < roadParts.Count - 1)
        triangles[triangleIndex= vertIndex;
        triangles[triangleIndex + 1= vertIndex + 2;
        triangles[triangleIndex + 2= vertIndex + 1;
        triangles[triangleIndex + 3= vertIndex + 1;
        triangles[triangleIndex + 4= vertIndex + 2;
        triangles[triangleIndex + 5= vertIndex + 3;

One of the keywords here is the width of the road. One of the aspects that comes with the dataset is the information about what type of road it represents. This information can be utilized by the user to customize a number of things ranging from connection limits and road inclusions (as discussed before) to in this case setting the width of the road per each individual type. Within the options provided to the user in (S3.2) customizations for a potential texture and tiling thereof are also available.

The following roads can now be produced using the method stated above.

Generation of an edge road mesh

5.2 Intersection generation

The more challenging part of the road generation was generating a mesh which connects any incoming edges to each other. In other words, generate a node. To start off with, up to this point nodes were only aware of their own edges (the ones originating from within this node). This was the first problem which needed to be taken care of at the loading phase of the nodes.

A solution for this problem is a new method which, after loading a partition or grouped section of nodes, seeks out any connections between these nodes. Once a connection has been detected this edge will be referenced from within the node as a so-called “external connection”. More details about these classes will be provided in chapter 6. Building placement.

Now that every node is aware of their outgoing edges as well as their incoming edges and can modify its values it’s time to generate the actual intersections. For this the following steps will be traversed:

  1. Each outgoing or incoming edge will be sorted in clockwise rotation from the center point of the node
  2. The edges will be extruded by a specified amount (S3.1) outwards from the node
  3. In the sorted order, each border of an edge road will be connected to its neighbouring border. These are the vertices.
  4. Triangles will be made from the outer vertices to a single vertex in the center of the intersection.

The steps discribed above are also illustrated down below.

Step by step intersection generation

The code for these steps can be divided into three parts. The first part takes care of the sorting, the second part of the extrusion and the third part of the actual mesh generation. These three are shown down below in their respective order.

//Sorting by rotation around y axis (again as now the external and internal nodes are concaternated)
Vector3 baseDir = new Vector3(roadConnectPoints[0].center.x, 0roadConnectPoints[0].center.y) - node.geometry_xyz;
    float angleP1 = Vector3.Angle(baseDirnew Vector3(, - node.geometry_xyz);
    if (Vector3.Cross(baseDirnew Vector3(, - node.geometry_xyz).< 0angleP1 = -angleP1;
    float angleP2 = Vector3.Angle(baseDirnew Vector3(, - node.geometry_xyz);
    if (Vector3.Cross(baseDirnew Vector3(, - node.geometry_xyz).< 0angleP2 = -angleP2;
    return angleP1.CompareTo(angleP2);
static RoadMeshGenerator.RoadConnectPoint GetConnectPoint(Vector2 nodePointVector2 linksGeoPointfloat roadWidthfloat extrusion)
    Vector3 nodePoint3 = new Vector3(nodePoint.x, 0nodePoint.y);
    Vector3 linksGeoPoint3 = new Vector3(linksGeoPoint.x, 0linksGeoPoint.y);
    Vector3 direction = (linksGeoPoint3 - nodePoint3).normalized;
    Vector3 dirTurned = Quaternion.AngleAxis(-90Vector3.up) * direction;
    Vector3 newCouplePointCenter = nodePoint3 + (direction * extrusion);
    Vector3 point1 = newCouplePointCenter + (dirTurned * (roadWidth * 0.5f));
    Vector3 point2 = newCouplePointCenter - (dirTurned * (roadWidth * 0.5f));
    RoadMeshGenerator.RoadConnectPoint roadConnectPoint = new RoadMeshGenerator.RoadConnectPoint()
        center = new Vector2(newCouplePointCenter.x, newCouplePointCenter.z),
        connectPoint1 = new Vector2(point1.x, point1.z),
        connectPoint2 = new Vector2(point2.x, point2.z)
    return roadConnectPoint;
else if(roadConnectPoints.Count > 2)
    //Placing center
    verts.Add(new Vector3(center.x, 0center.y));
    //Remaining points
    for(int i = 0i < roadConnectPoints.Count; i++)
        Vector3 v1Point = new Vector3(roadConnectPoints[i].connectPoint1.x, 0roadConnectPoints[i].connectPoint1.y);
        int v1Index = verts.Count - 1;
        Vector3 v2Point = new Vector3(roadConnectPoints[i].connectPoint2.x, 0roadConnectPoints[i].connectPoint2.y);
        int v2Index = verts.Count - 1;
        if(i < roadConnectPoints.Count - 1)
            triangles.Add(v2Index + 1);

Now that the intersections can be generated the following result can be achieved. Road edges now appear to be connected to one another.

Intersections generated based on connected nodes

The following settings are primarily used to customize this result:

  • (S3.1): Intersection extrusion

6. Building placement

One of the final elements of the city is the placement of buildings alongside roads which allow for this. Just like past chapters the process can be divided into steps which allow for a broader understanding within context of the entire pipeline.

  1. The generation of so-called plots alongside the roads in which the buildings can be build.
  2. The actual placement of buildings

6.1 Plot generation

The user can customize which roads support the placement of buildings by adding and removing them in the selection screen (S2.7). If a type is not selected, no buildings will be placed alongside this road.

For the generation of plots two methods were used which will both be thoroughly discussed within this chapter.

6.1.1 OBB plots for loops

The first method is based on an approach suggested by an article describing techniques for the generation of artificial parcels9. This particular approach describes the usage of an “Oriënted Bounding Box” (next: OBB) to subdivide the space in between a closed loop of roads. However, before this method can be adapted the data structure used for storing edges once again needs to be modified.

“This adaptive algorithm recursively splits a parcel into two smaller
parcels along the minor axis of the oriented bounding box
of the original parcel. The subdivision continues until userspecified shape attributes are satisfied.”

(C. Vanegas, 2012)9

In order for an OBB to be constructed, an algorithm needed to be adapted to find enclosed road loops within the nodes. The DCEL datastructure is the one chosen for this job. Adaptation of the DCEL data structure

A “Doubly Connected Edge List”10 (next: DCEL) is an edge type data structure in which each edge essentially both represents an edge moving from node A to B and from node B to A. Each of these edges also contains a reference to its direct twin which is moving in the opposite direction, to a previous edge and to a next edge. The idea behind this is by always taking the next edge in a clockwise rotation, you’re essentially always going to the left and will find any loops if these exist.

Illustration of the DCEL datastructure (Wikipedia)10

The problem with integrating this into the pre-existing edge data structure (called “LinksGeo”) is that this is entirely based on having edges be children of nodes. With other words, they only have a single parent. To solve this issue, two new classes were created: a special twin class called “LinksGeoTwin” and an abstract “LinksGeoEdge” class (from which linksGeo and LinksGeoTwin now inherit). All three classes can be seen below.

public abstract class LinksGeoEdge 
    public List<Vector2> links = new List<Vector2>();
    public abstract List<Vector2> Links { getset; }
    public Node parentNodeInternal, connectedToInternal;
    public abstract Node ParentNode { getset; }
    public abstract Node ConnectedTo { getset; }
    public LinksGeoEdge prevLinksGeo, nextLinksGeo;
    public LinksGeoEdge twin;
    public Face leftFace;
public class LinksGeo : LinksGeoEdge
    public override List<Vector2> Links { get => links; set => links = value; }
    public override Node ParentNode { get => parentNodeInternal; set { parentNodeInternal = value; twin.connectedToInternal = value; } }
    public override Node ConnectedTo { get => connectedToInternal; set { connectedToInternal = value; twin.parentNodeInternal = value; } }
    public LinksGeo()
        //Creating the twin
        twin = new LinksGeoTwin()
            parentNodeInternal = this.ConnectedTo,
            connectedToInternal = this.ParentNode,
            twin = this
    //Getter properties
    public float Length
            float totLength = 0;
            Vector2 prevLink = links[0];
            for(int i = 1i < links.Count; i++)
                totLength += Vector2.Distance(prevLink, links[i]);
            return totLength;
public class LinksGeoTwin : LinksGeoEdge
    public override List<Vector2> Links { get => twin.links.Reverse<Vector2>().ToList(); set => twin.links = value.Reverse<Vector2>().ToList(); }
    public override Node ParentNode { get => parentNodeInternal; set { parentNodeInternal = value; twin.connectedToInternal = value; } }
    public override Node ConnectedTo { get => connectedToInternal; set { connectedToInternal = value; twin.parentNodeInternal = value; } }

As can be seen LinksGeoTwin is nothing more than a redirection to the real LinksGeo class which holds the actual data. However, when a LinksGeoTwin is accessed as a LinksGeoEdge instance all properties return expected values despite the fact it is no actual LinksGeo Instance.

The new data structure is then filled using the following code.

public void DCELConnect()
    List<LinksGeoEdgeedgeList = SortLinksGeosClockwise();
    for(int i = 0i < edgeList.Count; i++)
        if (i == edgeList.Count - 1)
            edgeList[i].twin.nextLinksGeo = edgeList[0];
            edgeList[0].prevLinksGeo = edgeList[i].twin;
            edgeList[i].twin.nextLinksGeo = edgeList[i + 1];
            edgeList[i + 1].prevLinksGeo = edgeList[i].twin;

Now to construct a so-called “Face” of a partition (an actual rounded list of edges which connect to one another) another loop is done through all edges present. Once the same starting edge is found, a face is added.

foreach (LinksGeo lGeo in node.linksGeos)
    if (lGeo.leftFace != nullcontinue;
    Face newFace = new Face();
    LinksGeoEdge linksGeo = lGeo;
    int counter = 0;
    while (linksGeo.nextLinksGeo != null && counter < 10//Counter is implemented as a failsafe for when a face detection takes too long
        linksGeo.leftFace = newFace;
        linksGeo = linksGeo.nextLinksGeo;
        if (linksGeo == lGeobreak;
    if (linksGeo.nextLinksGeo == null || counter >= 10) { newFace.edges.ForEach((edge=> edge.leftFace = null); continue; }
    //Instructing face to generate plots
} Utilizing the OBB

Now that faces are in place, a library called geometry3sharp11 can be utilized to quickly calculate the OBB of this face. This is actually the call that is made at the bottom of the previous method when a new Face is constructed. This can be seen below.

ContMinBox2 boundingBox;
try { boundingBox = new ContMinBox2(points2d0.1QueryNumberType.QT_DOUBLE, false); }
catch (NotImplementedException)
    if (majorPoints != null//If major points provided, just using these as this is more reliable than all of the points
        points2d = new List<Vector2d>();
        majorPoints.ForEach((mPoint=> points2d.Add(mPoint));
    boundingBox = new ContMinBox2(points2d0.1QueryNumberType.QT_DOUBLE, true);
var center = (Vector2)boundingBox.MinBox.Center;
var axisX = (Vector2)boundingBox.MinBox.AxisX;
var axisY = (Vector2)boundingBox.MinBox.AxisY;
var extends = (Vector2)boundingBox.MinBox.Extent;
return new Box(boundingBox.MinBox);

As the generation of plots has not yet been fully achieved, the current functionality comprises of recursively subdividing the original OBB until a square of land can no longer be divided without going under some user defined minimum and maximum width constraints (S4.1), (S4.2). The final result of plot divisions currently returns the following plots.

OBB based plot subdivision result

6.1.2 Single roads

The second type of plot generation is much more straightforward in the fact that it’s actually nothing more than looping through the remainder of the edges (which are not part of a face) and placing plots along the boundaries of these roads.

For this the first step is to divide spots along this edge with a minimum and maximum distance width. For this purpose a specific method was thought out. First, a loop will go through all of the links inside an edge. Within each link a check will be executed if the distance between this link and the next lies within the boundaries of the supplied width parameter. If it does, or the next link is the end, the next link will be added as a second point of the plot along the first point.

However if it does exceed these boundaries a random position (that does lie within the boundaries) in between the two links will be chosen as the second spot and the cycle will continue as the new first spot will be the old second spot. The illustration below visualizes this process.

Illustration showing the process of subdividing an edge

When translated into code this looks like the following.

public static List<(Vector2Vector2)> GetRandomLinksDivision(List<Vector2linksfloat minDistancefloat maxDistance)
    List<(Vector2Vector2)> returnList = new List<(Vector2Vector2)>();
    Vector2? point1 = null;
    Vector2? point2 = null;
    for(int i = 0i < links.Count - 1i++)
        if (point1 == nullpoint1 = links[i];
        //Checking if distance with the next node is out of range
        float nextDist = Vector2.Distance((Vector2)point1links[i + 1]);
        while (nextDist > maxDistance)
            float ran = UnityEngine.Random.Range(minDistance / nextDistmaxDistance / nextDist);
            point2 = Vector2.Lerp((Vector2)point1links[i + 1], ran);
            returnList.Add(((Vector2)point1, (Vector2)point2));
            point1 = point2;
            nextDist = Vector2.Distance((Vector2)point1links[i + 1]);
        if(nextDist > minDistance || i == links.Count - 2)
    return returnList;

The only thing remaining is to expand the selected points backwards within the range of the specified min and max Y range and to take the width of the road type into account. This produces the following result.

Result of OBB and single edge plot generation

6.2 Filling plots

Originally the idea was to procedurally generate buildings which would be a perfect fit for the size of each generated plot. This, however was dropped in favor of placing a random “fitting” building chosen from a user specified selection (S4.3).

This means that the process of placing buildings is quite easy to grasp. For each generated plot the width is measured against the bounds of a random selection of the buildings. If it does not fit a new selection will be made whereas if it does fit the building will be instantiated at the center of the plot and rotated towards the XAxis direction of said plot.

            foreach(Face.Plot plot in plots)
        //Random building
        GameObject building;
        building = buildingGenerationParameters.buildingPrefabs[UnityEngine.Random.Range(0buildingGenerationParameters.buildingPrefabs.Count - 1)];
        for (int i = 0i < 2 && building.GetComponent<MeshFilter>().sharedMesh.bounds.extents.* 2 > plot.Width; i++)
            building = buildingGenerationParameters.buildingPrefabs[UnityEngine.Random.Range(0buildingGenerationParameters.buildingPrefabs.Count - 1)];
        GameObject wBuilding = GameObject.Instantiate(buildingplot.Center3, Quaternion.LookRotation(new Vector3(plot.axisX.x, 0plot.axisX.y)), parent.transform);

The result looks like the following.

Placement of buildings along roads example (trees are also specified as buildings)

7. Environment generation

Purely for aesthetic purposes the generation of a plane exactly the size of the city was incorporated into the tool. This is the entirety of stage 5 of the generation pipeline. From within the editor the user may choose to give a texture to the plane (S5.2) (normal and occlusion map also supported) or leave it at a solid color (S5.1). The plane can then be created using the button for stage 5 (S5.3).

8. Conclusion

Before anything can be concluded out of the evaluation of the main goal in relation to the actual achieved result it first is quite important to look back at the sub-goals / research questions which were stated at the beginning of this article. These questions should be able to be answered now that the project has concluded.

8.1 Answering research questions

“In what way can the acquired datasets be converted into a single readable format which allows for quick indexing and retrieval of a selection of nodes / edges?”

The process described in 2.2.3 exports the data into a single JSON file, a format which is widely accessible, where every node is partitioned into an array rounded down to the hundreds. This means retrieval of a selection of nodes and edges is quick provided they are loaded into memory.

“How can the converted data easily and quickly be loaded into Unity while customization of the eventual end result is still widely possible after this is completed?”

The same format used when exporting the data in Python is also used when loading the nodes in with C#. This is because this structure is easily transferrable using the JSON format. Because of this and the nature of the persistent data, quick retrieval, and thus customization, is possible after the data is loaded into memory.

“In what way can the data be easily partitioned / placed on a predetermined sized grid?”

As described in 4.2 a grid partition requires a certain range of nodes the size of that partition. The easy retrieval once again has to do with the way the data is partitioned.

“How can selections of nodes and edges belonging to different cities which potentially do not share the same internal structure be connected to each other while retaining a form of realism and plausibility?”

The main element here is the customization done by the user. Default all types of roads can be connected to all types of roads and this might result in some rather awkward connections. Using the settings provided in (S2) the ideal balance needs to be found.

“In which manner could road meshes be generated from the resulting nodes and edges which insure a smooth transition between different kinds of roads which inherently retain different characteristics?”

Roads are generated by creating a mesh which covers the edge links and an extra width. As described in 5.3, intersections play an integral role in this aspect as they connect edges of roads to one another regardless of the size of the road. This always insures a smooth transition.

“How can buildings be placed in available spaces (lots) in between the generated roads?”

First off, lots are created at locations where buildings should be placed. For this two techniques are used: OBB subdivision and Single edge traversal. The buildings themselves are currently just taken from a selection of prefabs.

8.2 The main goal

“To procedurally generate a city road layout which integrates various elements directly taken from a selection of pre-existing datasets based on actual real world cities”

In a way this goal has been achieved. It is indeed possible to load geographical data from different datasets and combining this to generate an entirely new layout. However, it is unlikely the first approach of random placement of segments bears much success of imitating a real city layout when most of the time road structures are so radically different from one another lining them up would be impossible. The later implemented method of group placement already greatly improves on this as now most segments are supposed to be next to each other and only bordering partitions of different cities need to be connected to each other.

Originally it was not the idea to divide the steps needed to achieve the goal in stages which is part of a grander pipeline. This, however, brings a lot more structure and understanding to the entire process and was something which came naturally during development of the tool.

9. Future prospects

The four weeks of the project were by no means enough to completely achieve all set out goals. Afterwards it’s clear most time was taken up by developing a way to connect nodes when these belong to a different city (week 2 and 3). In hindsight it should have been clear the random placement was never really going to lead to a solution which would look like a viable city. Together with a total of one week for data formatting this meant only one week remained to work on the generation of meshes. This means this is definitely the aspect where much more work can be put into. The following additions could be cosidered in order of importance:

  1. Finishing the OBB supported plot generation. Currently the initial OBB is just subdivided until the width is just right. This works up until a certain degree as the original OBB does not take smaller sections, which could be much more versatile, into account. Much of the code base for this aspect is already written but much more tuning and maturization is required.
  2. Using more of what’s already present. The dataset, and subsequently also the JSON file, comes with much more data than is utilized at the moment. Things like street names, speed limits or whether a road is a bridge or tunnel are all left unused.
  3. Implementing building mesh generation. This was initially the idea and would still be quite possible. This adds a layer of diversity and randomness to the final end result which makes it more unique.
  4. More environmental options (not that kind of environmental). Currently the only option available is to generate a plane with a texture. Later on trees and other types of vegetation could be implemented into this aspect. Even terrain generation could be possible if certain traits could be detected.

10. Sources

  1. Kelly, G. A Survey of Procedural Techniques for City Generation. Retrieved March 21, 2021, from
  2. Alireza, K. Amirhassan, K. Sybil, D. A protocol to convert spatial polyline data to network formats and applications to world urban road networks. Retrieved March 29, 2021, from
  3. Alireza, K. Amirhassan, K. Sybil, D. Cities datasets. Retrieved March 29, 2021, from
  4. (n.d.). Shapefile – Wikipedia. Retrieved March 29, 2021, from
  5. (n.d.). GeoPandas. Retrieved March 29, 2021, from
  6. (n.d.). JSON Serialization – Unity – Manual. Retrieved March 29, 2021, from
  7. James, N.K. Json.NET. Retrieved March 29, 2021, from
  8. Sebastian, L. [Unity] 2D Curve Editor (E06: road mesh). Retrieved March 29, 2021, from
  9. C. Vanegas, T. Kelly, B. Weber, J. Halatsch, D. Aliaga and P. Müller. (2012). Procedural Generation of Parcels in Urban Modeling – Purdue …. Retrieved March 30, 2021, from
  10. (n.d.). Doubly connected edge list – Wikipedia. Retrieved March 30, 2021, from
  11. Gradientspace. Geometry3sharp. Retrieved March 30, 2021, from

Related Posts