Geographical analyses and artificial intelligence for the study of late medieval settlements in Southern Tuscany (Italy)

This paper presents how Artificial Neural Networks (ANN), integrated in a GIS platform, have been used to analyse ancient settlement patterns, in particular fortified villages and medieval settlements in central Italy. The idea at the base of this method is that each settlement system can be seen as an outcome of a social and territorial background. Through the use of different data such as satellite images, historical cartography and archaeological evidence, it is possible to identify, with the help of ANN methodologies, some of the links between human settlements and the territory. This methodology allows a double approach: firstly the analysis of the distribution of ancient settlement patterns itself, and secondly the identification of undiscovered archaeological sites. Particular focus is put on the evaluation and quantification of the relationship between late medieval castles (Xth-XIIIth century AD) and environmental variables, and between current and ancient land use, with the integration of satellite imagery, in particular Landsat, and the use of historical cartography.

determined those choices. Moreover, the inner structure of computers, generally based on true/false and same/different logic, is totally different from the human way of thinking, so that they risk being useful only for tasks such as storing, organizing and editing data. However, especially in the last two decades, the use of artificial intelligence and fuzzy-based approaches has increased considerably in several disciplines, in particular that of Artificial Neural Networks (ANNs), one of the most successful applications in several areas (Abdi et al., 1999). ANNs also present various aspects suitable for geographical and archaeological research, and more generally for the study of historical issues, first of all because they can analyse partial, incomplete and fuzzy data, as well as the complex relationships between the initial variables and the results of a given process.
Applying these methods through a spatial approach makes it possible not only to engage the problem from different perspectives and to observe the invisible relationships between human settlements and their location, but also to highlight the relationships between different kinds of settlements. This operation can be done in a synchronic way, with data of the same period but from different areas, as well as in a diachronic way, with data from the same area placed on a timeline, which allows us to see changes, continuities and discontinuities.
Due to their characteristics ANNs can be applied to several archaeological and geographical problems, and in this case a double approach has been investigated: the creation of predictivity or risk maps on the one hand, and the planning of surveys and excavations on the other. The entire process presented in this paper is explained in detail in an open-access manual (Deravignone et al., 2013), which is available together with the software developed for pattern creation and raster layer processing.

Spatial analysis, archaeology and AI
Disciplines like historical and human geography or archaeology are characterized by incomplete, confused and very heterogeneous data, so that their analysis and interpretation always present a margin of ambiguity. In general these kinds of datasets must, by their very nature, be seen as a partial sample of the total population record. In this case an attempt was made to merge AI with GIS in order to extend the range of territorial analyses available for the comprehension and study of ancient settlements. Spatial analysis aims at describing complex themes, barely interpretable by themselves, in a synthetic and significant way by processing data with mathematical algorithms, mainly to create models of reality. In recent years the contribution of AI techniques to this field has become more substantial (see Barcelo, 1993, 2009; Ducke, 2003; Ganascia et al., 1996; Reeler, 1999; Zubrow, 2003), and the differences between these methods and the so-called "traditional" ones lie mostly in the ability of AI to analyse and manage complex, non-linear processes in a better way. In this case the relationships between different sites reach a degree of complexity that cannot be reduced to a linear model without the risk of obtaining improper results. The approach is based on the union of a GIS tool, characterized by its ability to analyse huge amounts of data quantitatively, and an ANN simulator. Thanks to raster mapping, which informs us about territorial features, we can assess whether a certain territory is likely to contain undiscovered sites. Due to their characteristics, ANNs are particularly used in the field of predictivity. Nowadays, they represent one of the central tools in archaeological studies, especially for the compilation of the so-called "risk assessment maps" (Hobbs, 1997; Leusen et al., 2005). Such maps are generally used as planning tools for field surveys, in order to identify the areas with the highest probability of containing archaeological sites. A further aim is to avoid damaging or destroying archaeological deposits that may lie underground, especially during the construction of new buildings and infrastructure. Moreover, this strategy saves considerable money in large-scale interventions, especially in areas with scarce visibility or difficult access. In this case the method was applied with a double aim: firstly the analysis of the distribution of ancient settlement patterns, and secondly the identification of undiscovered archaeological sites.
Castles in southern Tuscany and their spatial pattern
Castles and their settlement pattern represent one of the main themes of medieval archaeology in Mediterranean Europe and beyond, because of their importance as indicators in the study of medieval society, politics and economy. The phase of evolution of rural settlement patterns that affected a large part of Europe from the Xth century is known in Italy as "incastellamento", and seems to represent only a part of a more complex phenomenon: this new reality, the castle, should be understood not only as a noble residence, but as a central point for the organization of rural areas, meant here as the opposite of urban realities. Current knowledge of castles in southern Tuscany shows some elements of continuity that confirm the reinforcement of an already established power coming from the medieval village (Francovich & Ginatempo, 2000). From this perspective castles are not to be conceived as a break with past structures, unlike what emerges from the studies conducted in Latium, where these new entities seem to appear in places not settled before (Toubert, 1973).

From the archaeological point of view the early Middle Ages do not offer rich material evidence for a better comprehension of these phenomena. The major problem is the difficult interpretation of excavations, caused by the almost exclusive use of wooden structures, of which post holes are a very feeble trace, not often preserved. In the evolution of settlement patterns there is also a further phase to be considered, the so-called "secondo incastellamento" of the XIIth century (Francovich, 1995). Distinctive traces of this phase are a higher aggregation of population into castles, causing the abandonment of villages, and the origin of a new land arrangement with the so-called "terre nuove". The causes of these modifications are various, and the agents of this change are a heterogeneous group formed by small communities or single holders, both aristocratic and ecclesiastic, or officials.
The quantitative approach applied to these phenomena has its roots in settlement archaeology and takes into account the aspects related to environmental variables. The structure and function of castles are strictly connected to the surrounding territories, with deep connections to the economic factors that constitute one of the main components of power. From this point of view the "incastellamento" process, including its first and second phases, can be seen as a measure of the different realities of the Middle Ages, and can be used as a window onto social changes (Wickham, 1984).

Belgeo, 4 | 2014
The area of interest and the involved variables
The region of interest for these analyses is the south-eastern part of Tuscany, Italy, formed by the Provinces of Grosseto, Siena and Arezzo, with a particular focus on castles settled before 1150 AD. From the beginning a double approach has been applied: predictivity on the one hand and the study of settlement patterns on the other. The first approach used somewhat heterogeneous patterns, consisting of records from different areas and periods, while the second used only castles from a certain territory or time period. This was done especially to highlight possible differences and similarities, continuities and discontinuities, in both a diachronic and a synchronic approach.
The variables initially considered included altitude above sea level, mountain and hill slope, aspect (slope direction), and linear distance and "cost distance" from rivers, from the sea and from dioceses. By "cost distance" we mean, in a raster GIS environment, the least accumulated cost of travel from a source: instead of calculating the straight-line distance from one location to another (Euclidean approach), it determines the shortest weighted distance from each cell to the nearest source location. The purpose was to reach a wider description and understanding using variables that represent in a very comprehensive way the quality and features of each archaeological site. For simplicity the variables considered can be arranged in several groups, bearing in mind that each variable can belong to more than one group. A navigable river, for example, can be seen as a water supply or as a way of communication. Overall, we can distinguish some macro types of variables: morphological, environmental, movement related and economy related.
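The notion of "cost distance" described above can be illustrated with a minimal sketch. It assumes a small friction raster stored as a list of lists and uses Dijkstra's algorithm on an 8-connected grid; the function name `cost_distance`, the friction values and the rule for the cost of a move (step length times the mean friction of the two cells) are illustrative assumptions, not the implementation of any particular GIS package or of the software used in the paper.

```python
import heapq
import math

def cost_distance(cost, sources):
    """Accumulated least-cost distance from every cell to the nearest source.

    cost    : 2D list of per-cell friction values (hypothetical surface)
    sources : list of (row, col) source cells, e.g. cells crossed by a river
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    heap = []
    for r, c in sources:
        dist[r][c] = 0.0
        heapq.heappush(heap, (0.0, r, c))
    diag = math.sqrt(2)
    # 8-connected neighbourhood: (row offset, column offset, step length)
    steps = [(-1, 0, 1.0), (1, 0, 1.0), (0, -1, 1.0), (0, 1, 1.0),
             (-1, -1, diag), (-1, 1, diag), (1, -1, diag), (1, 1, diag)]
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry, a cheaper route was already found
        for dr, dc, length in steps:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # assumed move cost: step length times mean friction of both cells
                nd = d + length * (cost[r][c] + cost[nr][nc]) / 2.0
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist

# A costly cell in the centre forces the cheapest route to go around it.
friction = [[1, 1, 1],
            [1, 9, 1],
            [1, 1, 1]]
accumulated = cost_distance(friction, [(0, 0)])
```

With uniform friction the result reduces to an approximation of the Euclidean distance on the grid; the weighting is what distinguishes the two measures.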
At the same time, in the case of historical landscapes, it has to be considered that some variables cannot be taken into account because of landscape changes, and that others are not directly available, forcing the use of proxies. In this case historical documentation and cartography are of help, even if the likelihood of finding useful maps and descriptions decreases the further back in time one goes.
A very simple list of macro categories could include, for example, morphological aspects, geological aspects, hydrographical aspects, raw material supplies, distance from other sites, viability, climate, and so on. All the variables contained in these or in other categories can be of two types, quantitative and qualitative, which from our point of view mean measurable and not measurable. ANNs make it possible to perform an analysis containing both at the same time, expressing them, depending on their characteristics, in a continuous form (from -1 to 1) or in a Boolean form (1 or 0).
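As a minimal sketch of the two encodings mentioned above, the following assumes a linear (min-max) rescaling for measurable variables and one Boolean input per class for non-measurable ones. The source only states the target ranges, so the rescaling rule and the function names are illustrative assumptions.

```python
def scale_continuous(values):
    """Linearly rescale measurable values to the range [-1, 1] (assumed min-max rule)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant variable carries no information
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def encode_boolean(labels, category):
    """Express a qualitative variable as 1 where the cell belongs to `category`, else 0."""
    return [1 if label == category else 0 for label in labels]

# Hypothetical altitudes in metres and hypothetical geology classes for four cells.
altitude_scaled = scale_continuous([0, 250, 500, 1000])
is_clay = encode_boolean(["clay", "sand", "clay", "rock"], "clay")
```

A qualitative variable with several classes would, under this assumption, become several Boolean inputs, one per class.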
Given these premises, it is possible to apply this methodology in a double analytical approach: analysis of the relationships between settlement and territory, and analysis of the differences or similarities between different contexts. This can be performed in a synchronic way, by analysing different territories in the same period, and in a diachronic way, by analysing the evolution of settlements in the same area, highlighting continuities, discontinuities and possible interruptions in time. The analyses presented in this paper are based on feed-forward neural network methodology and were performed in three different rounds, each time including new features, as explained in the next paragraphs.

The methodology
The methodology explained here aims at adding a spatial aspect to ANN analyses using a GIS platform (Deravignone & Macchi, 2006). From the beginning of its development an important requirement was that it should run on almost any GIS software and on different operating systems, so that everyone can reproduce these analyses independently of the platform used (ArcGIS, Quantum GIS, and many others). For the same reason we decided to release all the developed tools under an Open Source license, and to rely on open file formats such as binary grids and tab-delimited text files, which are used in almost all GIS environments. The first step was to create a conceptual model able to reproduce the input-output features necessary for the ANN. This was made possible by using vector point data for inputs and raster surfaces for outputs. The points, derived from the ASFT database (Francovich, 1995) and present in the GIS platform, represent castles. Each point is related to a series of raster layers representing the variables, and takes its values from the cell of each layer in which it lies, as shown in figure 2.
For example, for each point there will be the degree of slope on which it lies, the altitude, the distance from the closest river and so on. This is true not only for the castles, but also for the so-called "negative points". These must be understood as places where there are no castles, and they are necessary for a supervised Neural Network approach. The procedure for the creation of this negative point pattern should take into account the areas covered by land surveys (so that it is clear that there are no castles), and should exclude all the areas that are too close to the actual castles. At the end of these sampling procedures we need to create a random pattern, with a density and aggregation similar to those of the real castles in the same area. The number of these non-site points must be much higher than the number of actual sites: if the two numbers were the same, the training set would imply that 50% of the territory is occupied by castles.
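The exclusion of areas too close to known castles can be sketched as a simple rejection sampling. This is a simplified illustration: it draws points uniformly inside a rectangular surveyed area, so it does not reproduce the aggregation of the real castle pattern mentioned above, and the function name, the coordinates and the 200 m buffer are hypothetical values chosen for the example.

```python
import math
import random

def sample_negative_points(castles, bounds, n_points, min_dist, seed=0):
    """Draw random non-site points inside `bounds`, rejecting any point
    closer than `min_dist` to a known castle.

    castles : list of (x, y) coordinates of known castles
    bounds  : (xmin, ymin, xmax, ymax) of the surveyed area (assumed rectangular)
    """
    rng = random.Random(seed)  # fixed seed so the pattern is reproducible
    xmin, ymin, xmax, ymax = bounds
    negatives = []
    attempts = 0
    # the attempt cap avoids an endless loop when the buffers leave no free space
    while len(negatives) < n_points and attempts < 100 * n_points:
        attempts += 1
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if all(math.hypot(x - cx, y - cy) >= min_dist for cx, cy in castles):
            negatives.append((x, y))
    return negatives

# Hypothetical example: two castles, ten times as many non-site points.
castles = [(500.0, 500.0), (200.0, 800.0)]
negatives = sample_negative_points(castles, (0.0, 0.0, 1000.0, 1000.0), 20, 200.0)
```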
The group formed by all the values (variables) that characterize each point (castles and negative points) forms the "training set" used for the ANN training. During this phase only a few cells are taken into account, those occupied by actual sites and non-sites. Once the ANN is trained, the raster layers are used in their entirety in order to process the whole territory. The output is a single raster layer that indicates, with values from 0 to 1, the "probability" of the presence of a castle. With this method we can easily compare different territories or time periods, by training a network on a certain territory or period and then using another one for the processing.
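The two phases described above (training on the sampled cells only, then processing every cell of the raster layers) can be sketched with a very small feed-forward network. The network size, learning rate, loss function and the toy 3 x 3 layers are all assumptions made for illustration; the sketch is not the ANN simulator used in the study.

```python
import math
import random

def sigmoid(x):
    x = max(-60.0, min(60.0, x))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-x))

class TinyMLP:
    """Feed-forward network with one hidden layer, trained by stochastic gradient descent."""

    def __init__(self, n_in, n_hidden=3, seed=0):
        rng = random.Random(seed)
        # each weight row carries one extra trailing entry for the bias
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _forward(self, x):
        h = [sigmoid(sum(w * v for w, v in zip(ws, x)) + ws[-1]) for ws in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h)) + self.w2[-1])
        return h, y

    def predict(self, x):
        return self._forward(x)[1]

    def train(self, samples, targets, epochs=2000, lr=0.5):
        for _ in range(epochs):
            for x, t in zip(samples, targets):
                h, y = self._forward(x)
                delta_out = y - t  # cross-entropy gradient at the output unit
                for j, ws in enumerate(self.w1):
                    delta_h = delta_out * self.w2[j] * h[j] * (1.0 - h[j])
                    for i, v in enumerate(x):
                        ws[i] -= lr * delta_h * v
                    ws[-1] -= lr * delta_h
                for j, hv in enumerate(h):
                    self.w2[j] -= lr * delta_out * hv
                self.w2[-1] -= lr * delta_out

def sample_cells(layers, cells):
    """Training phase: read the variables only at the cells of sites and non-sites."""
    return [[layer[r][c] for layer in layers] for r, c in cells]

def process_layers(net, layers):
    """Processing phase: apply the trained network to every cell of the rasters."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[net.predict([layer[r][c] for layer in layers]) for c in range(cols)]
            for r in range(rows)]

# Toy layers already scaled to [-1, 1]; sites sit on the high, steep cells.
altitude = [[-1.0, -0.8, -0.6], [-0.2, 0.0, 0.2], [0.6, 0.8, 1.0]]
slope = [[-0.9, -0.7, -0.5], [-0.1, 0.1, 0.3], [0.5, 0.7, 0.9]]
layers = [altitude, slope]
sites = [(2, 1), (2, 2)]
non_sites = [(0, 0), (0, 1), (0, 2), (1, 0)]
targets = [1.0] * len(sites) + [0.0] * len(non_sites)

net = TinyMLP(n_in=len(layers))
net.train(sample_cells(layers, sites + non_sites), targets)
prob = process_layers(net, layers)  # one output raster with values between 0 and 1
```

Comparing territories or periods then amounts to calling `process_layers` with the layers of a different area or period than the one used in `train`.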

Analyses and results
The purpose of these analyses was to reach a wider description and understanding using variables that represent in a more comprehensive way the quality and features of each archaeological site. To improve the first analyses, an important step forward was based on the assumption that not only the site location is important, but also the probability values of the surrounding areas. For medieval settlements, as well as for other kinds of sites, the characteristics of the surroundings represent a primary source of information. This is because rural settlements are not only residences for the human population but also productive units highly integrated in the territory.
This insight was put into practice by performing a simple raster shift of the desired number of meters. New analyses were performed adding 8 shifts (north, north-east, east, and so on around the compass) for all the variables, focusing especially on those concerning the geomorphology. These shifts were performed at different distances from the actual location of the cells, and showed empirically that the best results were obtained with shifts between 100 and 300 m. The difference between the results of the two analyses, the first using only the morphological variables of each cell across the whole study area, and the second also including those of the surrounding cells, was clearly visible, particularly in the large decrease of the high probability areas (Deravignone & Macchi, 2006), as explained below.
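The raster shift can be sketched as follows. The sketch assumes that row 0 is the northern edge of the raster, that the shift distance is rounded to a whole number of cells, and that cells shifted in from outside the raster receive a no-data value; the function names and the compass table are illustrative.

```python
def shift_raster(raster, d_row, d_col, nodata=None):
    """Return a raster in which each cell holds the value found
    d_row rows and d_col columns away from it in the input."""
    rows, cols = len(raster), len(raster[0])
    out = [[nodata] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            sr, sc = r + d_row, c + d_col
            if 0 <= sr < rows and 0 <= sc < cols:
                out[r][c] = raster[sr][sc]
    return out

# (row offset, column offset) for the 8 compass directions, row 0 being north
COMPASS = {"N": (-1, 0), "NE": (-1, 1), "E": (0, 1), "SE": (1, 1),
           "S": (1, 0), "SW": (1, -1), "W": (0, -1), "NW": (-1, -1)}

def neighbourhood_layers(raster, distance_m, cell_size_m):
    """Build the 8 shifted copies of a variable at a given distance in metres."""
    k = max(1, round(distance_m / cell_size_m))  # shift expressed in whole cells
    return {name: shift_raster(raster, dr * k, dc * k)
            for name, (dr, dc) in COMPASS.items()}

# Hypothetical 3 x 3 slope raster with 100 m cells, shifted by 100 m.
slope = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
shifted = neighbourhood_layers(slope, 100, 100)
```

Each shifted copy becomes an additional input layer, so one variable contributes nine inputs per cell: its own value and those of its eight neighbours at the chosen distance.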
A further improvement was the use of Landsat TM5 satellite imagery as a source of new variables. With a resolution of 30 m this imagery is not suitable for shape recognition, but it is advantageous for the study of other important variables: certain bands, such as the infrared ones, record information that is not visible to the human eye, thus adding much more information on, for example, the humidity or even the "quality" of the soil. In this case (which we can identify as the third run) the differences from the previous analyses were clearly marked and allowed us to exclude at least 80 percent of our case study area, making the high probability areas much easier to isolate and better focused. After a reclassification of the results, and with the use of test patterns consisting of randomly excluded castles (explained below), it became clear that all areas above the 98 percent value should be considered high probability areas.
Comparing the results, it can be seen that the morphology of the site and of its surroundings, and the proximity to the main resources (water, raw materials, etc.), are certainly very important for this kind of analysis. The significance of the test was initially measured in the laboratory, using the GIS platform. The resulting raster values were sampled at test patterns made of randomly excluded castles, left out of the analyses for testing purposes only; almost 2/3 of these castles fell in cells with a probability above 80 percent.
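The laboratory test described above can be sketched as two simple measures on the output raster: the share of held-out castles that fall above a probability threshold, and the share of the territory that remains above that threshold. The function names and the toy values are hypothetical; they are not the results of the study.

```python
def hit_rate(prob_raster, test_cells, threshold):
    """Fraction of held-out castles lying in cells whose value exceeds the threshold."""
    hits = sum(1 for r, c in test_cells if prob_raster[r][c] > threshold)
    return hits / len(test_cells)

def area_fraction(prob_raster, threshold):
    """Fraction of the whole raster whose value exceeds the threshold."""
    cells = [v for row in prob_raster for v in row]
    return sum(1 for v in cells if v > threshold) / len(cells)

# Toy output raster and three hypothetical castles excluded from training.
prob_raster = [[0.10, 0.90],
               [0.85, 0.30]]
test_castles = [(0, 1), (1, 0), (1, 1)]
rate = hit_rate(prob_raster, test_castles, 0.80)
kept = area_fraction(prob_raster, 0.80)
```

A useful model keeps the hit rate high while the retained area stays small, which is the trade-off behind excluding a large part of the study area.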
As said before, the ANN approach was used as a numerical model that may be applied to another territory in order to measure the similarities or differences between two settlement patterns, for example by training a network on the Grosseto province and using it to analyse the Siena area. In this case the link or association between the two archaeological areas is based on a fully integrated model of the settlement system with its specific environment. Even if this approach offers many possibilities, the analysis of the results is particularly difficult, especially when it comes to the "black box" problem. This is typical of neural networks, since they allow us to "see" and analyse the input and the output, but not clearly what happens in the middle of the processing phase.
The same can be stated about analyses conducted with the same approach, but using different time frames instead of different areas. In this case the only easily visible difference in the resulting raster maps was that, going forward in time and using the same number of records, there was an increase in high probability areas.
A necessary further step was to test the analyses in the field. The aim was to choose an area with little information about archaeological sites owing to, for example, a high level of forestation and the consequent impossibility of performing a standard survey. For this purpose the area corresponding to the old Volterra diocese was selected, in particular the Berignone wood area, which is today a natural park characterized by very dense vegetation (Deravignone, 2009). After the analyses for the old diocese area had been performed, the high probability portions were exported to a GPS device in order to locate them during the survey. This led to the discovery of several new sites, many of them with elevated features or walls. Although it is still premature to classify these sites as hilltop settlements, a positive fact is that, by analysing the abundant pottery found in all the recognized sites, it was possible to date all of them from the XIth to around the end of the XIIIth century, the same period used in the training process.

A different approach: ANN and ancient land use
Land use is strictly related not only to environmental characteristics, but also to settlement patterns. Its study with ANN methodologies was considered in order to see whether it was possible to identify the main tendencies present during the first decades of the XIXth century in a small area of the current Siena province, in southern Tuscany (Deravignone, 2011). The starting basis for this period was the "Leopoldino cadaster" (Biagioli, 1975), which covers almost the whole Tuscan territory. In this case too, the data collection was performed systematically in a GIS environment, vectorizing the cadaster with all the features of each parcel and its attributes, such as land use, owner, amount of taxes and so on.
The basic idea is the following: given different categories of land use, where will they be placed in a certain territory, considering its geomorphological and settlement characteristics? Obviously, far from adopting a deterministic approach, the results are not intended to be universally applicable, but to serve as general indicators that may be useful in similar environments and for some basic macro categories such as cultivated fields, tree plantations, pastures and woods.
Some of these categories are strictly related to human settlements. Orchards, for example, can be considered an indicator of anthropization, because they need particular attention, which implied, especially in ancient times, limited distances from urban or settled areas.
To perform these analyses a network was trained on the territory of San Quirico d'Orcia and then applied to the Buonconvento area (Siena, Italy). In the training phase all land use categories were treated separately, creating one map for each use and then setting a threshold based on the percentage of territory occupied by that category, in order to visualize all the types on a single map. Looking at the results it is clearly visible how much the land use is influenced by the […]. These results, which at first glance may appear obvious, demonstrate again how much settlements and geomorphology affect land use. The organization of the territory, of cultivated and wild areas, is strictly connected to and based on simple features, while at the same time maintaining characteristics of high complexity.
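One possible reading of the thresholding step described above is sketched here: for each land use category, the threshold is the output value that leaves roughly that category's share of the territory above it, and each cell on the combined map takes the category with the highest output among those passing their threshold. The tie-breaking rule, the function names and the toy maps are assumptions, since the text does not specify how overlaps were resolved.

```python
def share_threshold(prob_map, share):
    """Output value such that roughly `share` of the cells are at or above it."""
    values = sorted((v for row in prob_map for v in row), reverse=True)
    k = max(1, round(share * len(values)))
    return values[k - 1]

def combine_maps(prob_maps, shares):
    """Merge one output map per land use category into a single categorical map.

    prob_maps : dict of category name -> 2D list of network outputs
    shares    : dict of category name -> fraction of territory it occupies
    """
    names = list(prob_maps)
    thresholds = {n: share_threshold(prob_maps[n], shares[n]) for n in names}
    first = prob_maps[names[0]]
    rows, cols = len(first), len(first[0])
    combined = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            passing = [(prob_maps[n][r][c], n) for n in names
                       if prob_maps[n][r][c] >= thresholds[n]]
            if passing:
                combined[r][c] = max(passing)[1]  # assumed rule: highest output wins
    return combined

# Toy example with two categories, each occupying half of the territory.
maps = {"fields": [[0.9, 0.8], [0.2, 0.1]],
        "woods": [[0.1, 0.3], [0.7, 0.95]]}
land_use = combine_maps(maps, {"fields": 0.5, "woods": 0.5})
```

Cells that pass no threshold stay unassigned (`None`) in this sketch.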

Conclusions
The initial aim of creating a procedure (https://goo.gl/AWnA8J) covering the entire process, from data creation to on-site survey, has been achieved, and the procedure has been tested in different environments and with different site types. It can be stated that the ANN methodology makes it possible to detect variables that apparently characterize the underlying invisible relationships between territory and settlement patterns, while the results form the basis for a fully integrated model of the settlement system with its specific environment. The numerical model will allow researchers to observe and compare results over time and across different territories, while the application of this method to land use was a first attempt to compare the relationships between settlements, landscape, food production and natural environments.
The "incastellamento" appears again to be a very complex phenomenon, and our analysis has highlighted some of the different relationships between castles and their environment. Also interesting were the results related to the use of variables not directly connected to the settlement pattern, like Landsat satellite data. The full potential of the method is still distant and the gaps are numerous, especially concerning the choice of variables and their use, but further improvements should give archaeologists a greater number of possibilities for performing these analyses.

Figure 1. Italy and Tuscany with the area of interest and the castles of the case study.

Figure 4. Detail of the land use of the Buonconvento area at the beginning of the XIXth century from the Leopoldino Cadaster (on the left) compared to the result obtained by ANN analysis (on the right).
Plain areas and valley bottoms are usually dedicated to cultivated fields, while steep areas are dedicated to pasture. Indeed, one can observe a direct relationship between settlements and the aforesaid land use types, as in the following examples:
• Almost all tree plantations are very close to settlements and big farms
• Cultivated fields are almost exclusively in valley bottoms, plains and clay hill tops
• Pastures and forests are mostly on hill slopes and high slopes