How green the urban development units in Sofia are: Earth observation and population time series analysis

Abstract
Over the last decades, the pressure that people and their activities put on the environment has increased. Green areas in many cities are diminishing in size due to urbanization, which inevitably leads to a decrease in quality of life. This study uses remote sensing (RS) data for Sofia, Bulgaria, for a period of nearly four decades, analyzing the dynamics of the Normalized Difference Vegetation Index (NDVI) of the urban development units (UDUs). NDVI statistics were calculated for each UDU for eleven dates in the following years: 1987, 1990, 1992, 1993, 1996, 2000, 2001, 2002, 2011, 2015, and 2020. An estimate was made of the amount of green vegetation per capita, similar to other coefficients used for population analysis. NDVI profiles for major urban parks showed differences for the studied period. Sentinel-2 data for 2020 was used for visualization of the current situation, in combination with detailed population data for all UDUs. The obtained data will help the decision-making process for the development of UDUs, while the methodology can be applied in any other city worldwide.


Introduction
Urban green infrastructure (UGI) is one of the most important factors for good quality of life (Huang et al., 2018). In recent years, public interest in environmental issues has grown significantly, not only in Bulgaria but worldwide, for both urban and non-urban areas (Sarafova, Petrova, 2020). The structure of green areas also affects the cooling of urban areas (Zhang, 2017). Scientists, public bodies and organizations now have access to a variety of remote sensing data and related products to further study changes and develop future scenarios. In the next few years, the focus in Europe will be on green issues, especially those related to the Green Deal and its implications for urban development. As far as Sofia is concerned, the changes in the landscape structure of UGI have been analyzed in terms of their impact on the heat island (Dimitrov, Popov, Iliev, 2020), mapping (Vatseva, Kitev, Genchev), and targeted actions aimed at improving the access of the population to quality green areas (Un-Habitat, 2016).
The Normalized Difference Vegetation Index (NDVI) has been used for decades and is one of the most popular indices for assessing the condition of vegetation. Its application is diverse, and in urban areas it can show how much healthy vegetation a particular area has. An obvious disadvantage of this type of quantification is that it cannot distinguish between a well-maintained green area and sites such as abandoned terrains with overgrown vegetation. However, through the use of NDVI and the zonal statistics obtained for urban planning units, large differences can be reported, and these data can complement existing quantitative and qualitative indicators (Abutaleb, 2020; Fung & Siu, 2000; Nouri et al., 2017).
A sophisticated database of the municipality of Sofia and its urban development units is maintained by the municipal entity "Sofiaplan", where passports with quantitative and qualitative characteristics have been created for all urban planning units (Sofiaplan, 2021). According to the team that developed the latest update of these units, they represent "territorial parts with similar morphological characteristics, taking into account the existing local structures of the functional spatial systems (by types: urbanized territories -labor, recreation, habitation, etc., forests, agricultural lands, waters)".
Because of the many datasets and satellite images now available, spatial analyses of the condition and size of urban green infrastructure, and of its proximity to the city's neighbourhoods, are widely used. The main objective of the present study is the evaluation of UDUs based on satellite images, comparing data on the UDUs' NDVI median for different historical periods. The result will map the green UDUs and identify areas where there is almost no healthy green space, for a period of nearly 40 years. Specific objectives are: 1) to relate the NDVI data and the derived zonal statistics with population data, in order to see how much green vegetation there is per capita in the respective UDU; 2) to compare the results for the whole time period; 3) to further investigate the changes in some of the biggest urban green spaces in the studied territory.
In order to achieve the goal of the research, a methodology for analysis of time series from the archives of the Copernicus program and the Landsat program for the last nearly 40 years has been applied.

Study Area
Sofia-city is one of the 28 regions of the country (NUTS 3). The region's area is 1348.902 km². It is the most densely populated area in Bulgaria and contains one municipality (LAU 1), covering the same territory. There are several mountains (Vitosha, parts of Stara Planina Mountain, Lyulin, Ihtimanska Sredna Gora, Plana). The center of the territory in discussion, where the majority of the people live, is the Sofia structural basin, which lies at an elevation between 500 and 650 m and is home to more than 1.5 million people. The Iskar Reservoir is located in the southeastern part of the area. It is the largest reservoir in the country and supplies drinking water to the capital. The longest river in Bulgaria, the Iskar River, flows from south to north, passing by Sofia near the international airport.
Over the last several decades, the urban population worldwide has grown significantly. According to the UN, the urban population of the world grew rapidly from 751 million in 1950 to 4.2 billion in 2018. According to the National Statistical Institute, the urban population in Bulgaria constituted 75.7% of the total as of 2020. Sofia is the capital and the largest city in the country, where approx. 1.5 million people live. Census data for the last several decades show that the urban population has grown (Fig. 2).
According to the census data, the population of the capital increased in the period following 2001. The share of the urban population increased from 71.6% in 1934 to 95.3% in 2011 (Fig. 3).
According to the National Statistical Institute (NSI) data, as of 31.12.2020, within the borders of Sofia-city region, both the largest city in the country (with a population of 1 221 785 people) and the largest village (Lozen, with a population of 6187 people) are located.

Methods and data
Comparing data received at the same time of year from the same sensor is the best way to correctly compare and draw conclusions about the state of the territory. However, the reality is quite different: in this part of Bulgaria the maximum rainfall occurs in the spring, and the resulting cloud cover limits the possibilities for choosing clear, good-quality images. For this reason, we chose the clearest possible images for each of the years within the time frame of the study, guided by the following:
• Due to the availability of population census data, we included images from 1990, 2001 and 2011;
• It was important to choose quality data to show the situation before 1989, the year when the political and economic situation in the country changed;
• Despite the presence of clouds in a very small part of the studied territory, outside the capital, we chose the most appropriate image from 2011. This way we could link the data with the census statistics available from the National Statistical Institute.
The current situation was analyzed in detail using Sentinel-2 data for 2020, which was correlated with the population data from the Directorate General for Civil Registration and Administrative Services for the same time period. Unfortunately, cloudless data for the period from May to August is not available for the study area. A compromise was made with an image from June 27, 2020, which crosses the easternmost part of the studied territory. The UDUs there have an extremely small number of residents and were therefore excluded from the analysis and the general statistics.
In order to further investigate the condition of the major urban green areas, NDVI profiles for the whole time period were generated. By generating the profiles it is possible to track the change of values for the whole time period and thus locate the places where construction or other negative factors affect the condition of the green areas.
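As an illustration of how such profiles can be compared between years, the following is a minimal Python sketch, not the QGIS workflow used in the study. It samples an NDVI grid along a straight transect with nearest-neighbour lookup and flags positions where the value dropped between two dates. The function name `sample_profile`, the small grids and the 0.3 threshold are all hypothetical choices made for the example.

```python
def sample_profile(raster, start, end, n_samples):
    """Sample raster values at n_samples evenly spaced points from start to end.

    start, end -- (row, col) positions in cell coordinates.
    Uses nearest-neighbour lookup; the transect must stay inside the raster.
    """
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    (r0, c0), (r1, c1) = start, end
    profile = []
    for i in range(n_samples):
        t = i / (n_samples - 1)          # 0.0 at the start, 1.0 at the end
        row = round(r0 + t * (r1 - r0))
        col = round(c0 + t * (c1 - c0))
        profile.append(raster[row][col])
    return profile


# Hypothetical NDVI grids for two years (illustrative values only).
rasters_by_year = {
    1987: [[0.5] * 5, [0.6, 0.6, 0.6, 0.6, 0.6], [0.5] * 5],
    2015: [[0.5] * 5, [0.6, 0.6, 0.6, 0.2, 0.1], [0.5] * 5],
}

# Same transect (middle row, left to right) sampled in every year.
profiles = {
    year: sample_profile(grid, start=(1, 0), end=(1, 4), n_samples=5)
    for year, grid in rasters_by_year.items()
}

# Positions along the transect where NDVI fell by more than an assumed 0.3.
drops = [
    i for i, (old, new) in enumerate(zip(profiles[1987], profiles[2015]))
    if old - new > 0.3
]
```

In this toy example the last two sample positions are flagged, which is the kind of pattern a newly built-up section of a park would produce.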
QGIS 3.16.2 was used to perform the analysis, following the procedure shown in Fig. 4. Data from the Landsat program archive, as well as Sentinel-2 data from 2020, was used to create the time series, as described in Table 1 below.
The formula used to calculate NDVI is:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{Red}}{\mathrm{NIR} + \mathrm{Red}}$$

where NIR and Red are the values of the near-infrared and red spectral bands. The values were calculated in Raster Calculator using the corresponding red and near-infrared bands of each sensor.
The archives of the Landsat program are an extremely valuable resource through which data for almost the last 40 years can be compared. The data is widely used for time series analysis worldwide and represents a reliable source of information.
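As an illustration of the NDVI calculation, the following minimal Python sketch applies the standard definition, (NIR - Red) / (NIR + Red), cell by cell to two small grids. It is not the Raster Calculator expression used in the study; the function names and the reflectance values are hypothetical, and returning `None` where both bands are zero is an assumed no-data convention.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel, or None where both bands are zero (no data)."""
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


def ndvi_raster(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2-D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical reflectance values, not taken from the study's imagery.
nir_band = [[0.50, 0.40], [0.30, 0.0]]
red_band = [[0.10, 0.20], [0.30, 0.0]]

result = ndvi_raster(nir_band, red_band)
```

Dense vegetation reflects strongly in the near-infrared and absorbs red light, so the first cell gets a high value, while the cell with equal reflectance in both bands gets zero.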

Sentinel-2
Sentinel-2 data was used for the analysis of the current situation of the territory. The spectral bands used for the calculation of NDVI, Band 4 and Band 8, are defined in the technical documentation. The application of data from this source is promising, given its pixel size of 10 m, which is much more appropriate for urban analysis than that of Landsat 5 and Landsat 7. However, due to the different pixel size and other considerations, we did not include the Sentinel-2 image analysis data in the overall comparison of the time series. The Sentinel-2 results are described in a separate section of the paper.

Population Statistics
Cities are home to people, so the use of different indicators to assess the urban environment is extremely important. In order to calculate how much live green vegetation there is per person in each UDU, census data from the National Statistical Institute and 2020 population data from the Directorate General for Civil Registration and Administrative Services were used.

NDVI Time Series
NDVI was calculated for all 10 dates of the selected satellite images (Fig. 5). The specific dates differ for each year, which adds to the many other factors that affect the observed condition of the vegetation, such as whether the year was wet or dry and in which month the image was taken.
NDVI minimum and maximum values for each year are shown in Table 3 below.
The raster statistics were calculated for each UDU with a Zonal Statistics tool. The resulting layers are visualized in Fig. 6 with their median values.
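The idea behind zonal statistics can be sketched in a few lines of Python: every NDVI cell is assigned to the unit it falls in, and the cells of each unit are then summarised. This is only an illustration of the principle, not the QGIS tool itself. The set of statistics computed here is an assumption, and the grids and zone ids are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def zonal_statistics(ndvi, zones):
    """Group NDVI cells by zone id and summarise each group.

    ndvi  -- 2-D list of NDVI values (None marks no-data cells)
    zones -- 2-D list of the same shape holding a zone (UDU) id per cell
    """
    cells = defaultdict(list)
    for ndvi_row, zone_row in zip(ndvi, zones):
        for value, zone_id in zip(ndvi_row, zone_row):
            if value is not None and zone_id is not None:
                cells[zone_id].append(value)
    return {
        zone_id: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
            "median": median(values),
        }
        for zone_id, values in cells.items()
    }


# Hypothetical 3x4 NDVI grid and zone ids (illustrative only).
ndvi_grid = [
    [0.10, 0.20, 0.60, 0.70],
    [0.10, 0.30, 0.80, None],
    [0.20, 0.20, 0.50, 0.90],
]
zone_grid = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
]

stats = zonal_statistics(ndvi_grid, zone_grid)
```

The median is less sensitive than the mean to a few very high or very low cells, which matters for units that mix built-up blocks with small patches of dense vegetation.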
The results show that the UDUs on the outskirts of the city have more live green vegetation than those in the city center, which is understandable given the geographical characteristics of the territory. Some parts of the city center and the largest neighbourhoods, where hundreds of thousands of people live, also have low median values.

NDVI Historical Profiles
Urban green zones were further investigated with NDVI profiles (Fig. 7).
For each raster a colour was defined, depending on the year which the values correspond to (the darkest colour being the oldest and the lightest being the most recent data).

Borisova Garden Park
The values for this park in Fig. 8 coincide almost everywhere. In the sections where boulevards cross the park, the values drop sharply, which can be clearly seen in the image.

South Park
In recent years, parts of the park near the Lozenets neighbourhood were built up, and this can be seen at the end of the third profile. In the 1980s there were trees and other park vegetation there, and now there is a new neighbourhood named after the park (Fig. 9).

West Park
The main part of the park, which is mostly used for leisure, has the same profiles for all of the years included in the time frame of the study (the third profile). The first two parts of the park show differences in NDVI values, the reason for which could be further investigated (Fig. 10).

Other parks
The other parks' profiles (Fig. 11) show similar values. The NDVI of Zaimov Park is almost the same for all years within the time frame of the study, dropping where a theatre building is located (seen at the end of the graph). The other two parks, Petar and Pavel Park and North Park, again show nearly the same NDVI values for all years within the time frame of the study.

NDVI Per Capita
The map resulting from the calculation of NDVI values divided by the number of people living in the corresponding UDU can be seen in Fig. 12. Again, the outskirts of the city have more live green vegetation per capita than the city center. For some UDUs there are "false" positive results, for example in places where there are abandoned areas. However, when comparing the individual UDUs, important conclusions can be drawn, especially for the neighbourhoods where many people live.
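The per capita calculation can be illustrated with a short Python sketch. The text does not specify which NDVI statistic is divided by the population, so the function below simply takes one NDVI value per UDU as an assumption. The UDU ids, NDVI values and population counts are hypothetical, and skipping units with no residents is an assumed convention.

```python
def ndvi_per_capita(ndvi_stat_by_udu, population_by_udu):
    """Divide each UDU's NDVI statistic by its population.

    UDUs with missing or zero population are skipped rather than
    producing a division by zero.
    """
    result = {}
    for udu_id, ndvi_stat in ndvi_stat_by_udu.items():
        population = population_by_udu.get(udu_id, 0)
        if population > 0:
            result[udu_id] = ndvi_stat / population
    return result


# Hypothetical UDU ids, NDVI statistics and populations (illustrative only).
ndvi_stat = {"udu_1": 0.30, "udu_2": 0.60, "udu_3": 0.45}
population = {"udu_1": 15000, "udu_2": 300, "udu_3": 0}

per_capita = ndvi_per_capita(ndvi_stat, population)

# UDU ids ordered from the lowest to the highest per capita value.
ranked = sorted(per_capita, key=per_capita.get)
```

Because the population is in the denominator, sparsely populated units dominate the upper end of the ranking, so the values are most informative when densely populated units are compared with each other.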

NDVI Stats -2020
For 2020, Sentinel-2 bands were used to generate NDVI statistics. The results are shown in Fig. 13. The highest median and per capita values for UDUs are observed in parts of Pancharevo, Bankya and Vitosha. The reason is that these residential areas consist mostly of houses rather than apartment buildings and are surrounded by mountainous areas.
The per capita data shows that the UDU of Studentski city district has the lowest value (0.066), whereas the highest values are observed in the villages in the surrounding territory, such as Podgumer, Mramor and German.

Discussion
The use of large datasets and a series of analyses for different periods can show trends and processes that would otherwise be invisible to researchers or managers of the respective territorial unit. Adding statistical data derived from NDVI to the passports of each UDU can give additional ideas for the future development of each part of the region. At the same time, however, this type of data should be used in combination with an analysis of the accessibility of parks and green areas. This study generated data and identified differences in the index for a period of almost 40 years. The methodology used shows large differences within the study area in terms of the amount of green vegetation that is available in different parts of the city:
• There are densely populated areas with almost no green vegetation.
• The outskirts of the city near the surrounding mountains have higher median and per capita NDVI values than the city center.
• The NDVI median for 2020 shows lower values in the heart of the city, surrounded by a small green ring.
• The two biggest neighbourhoods, Lyulin and Mladost, are surrounded by UDUs with higher median values.
• The NDVI profiles show similar values for the whole time period, which is due to the continuous maintenance of the parks in the capital.
Some of the disadvantages of the methodology used in this research include:
• the data shows the condition for each UDU, but not how close each person's home is to a green area. A typical example of this is the data from 2020, which shows that the lowest values per capita are for parts of the Studentski city district, which, however, borders the National Sports Academy as well as Studentski Park, both of which are separate UDUs;
• the differences in pixel size over the period of almost 40 years, as well as the specifics of the processing of these data, inevitably lead to differences in the way the analyses look;
• using Landsat 5 and Landsat 7 data for urban analysis should be done with caution because of the mixed land cover that falls into each pixel.
Despite these considerations, the application of historical data to the UDU passports can contribute to the decision-making process for the development of the city's green infrastructure. With regard to the population, analysis could be done not only for the availability of healthy vegetation per capita, but also by age groups, to see if neighbourhoods inhabited mainly by young people with children are supplied with enough green vegetation.

Conclusion
There is a specific Sustainable Development Goal (SDG) that concerns sustainable cities and communities, namely Goal 11, which is 'to make cities inclusive, safe, resilient and sustainable'; as such, this particular goal is closely aligned with the Europe 2020 growth strategy 'to become a smart, sustainable and inclusive economy' (UN, 2021). Copernicus data provides new opportunities for the assessment of urban areas. Comparisons of data from different periods, especially in combination with Landsat data from the past four decades, show the dynamics of the urban environment in a new, intriguing way. The analysis in this study shows that the most populated UDUs have fewer green areas. It is also clear that people living in the suburbs of the city enjoy more live, green vegetation in their surroundings. NDVI has been used for a long time worldwide as an indicator of the condition of vegetation. It could be used as a starting point in the analysis of UDUs' statistics and further combined with various statistics related to population and the quality of urban areas.

Acknowledgement
The study was part of the project "Green Urban Areas and Public Health - Analysis and Evaluation of the Four Largest Bulgarian Cities through Remote Sensing and Crowdsourcing Data from Mobile Applications", National Scholarship Program "For the Women in Science" 2018/2019, implemented with the support of L'ORÉAL Bulgaria, National Commission for UNESCO - Bulgaria and Sofia University "St. Kliment Ohridski".