This project is part of an effort to explore the properties of languages from the perspective of quantitative linguistics. We examine similarities and differences among languages across the globe using homoscedasticity analysis and non-linear regression techniques.
- Creating a processed directory of data files that contain the metrics needed to fit a non-linear regression model. These metrics are "mean edge length" and "second degree moment".
- Preparing a script that produces summary statistics for each language: the sample size, the mean and standard deviation of the sentence length (n) of the syntactic dependency trees, and the mean and standard deviation of the mean edge length over all sentences in that language.
- Fitting of models with pending issues for
- Fitting each of the models
- Using aggregate and defining data filters
- Include results for model 0
- Create a custom function for computing AIC, S^2, and delta AIC
- Table with the parameter values giving the best fit for each model, following the format of Table 3.
- Visualization
- Report writing
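
As a rough illustration of the AIC, S^2, and delta AIC item above, here is a minimal Python sketch. It is not the project's actual implementation. It assumes a least-squares fit with Gaussian errors, so that AIC can be computed from the residual sum of squares (RSS), and it assumes S^2 means the residual variance RSS / (n - p), where n is the number of observations and p the number of fitted parameters. All function names are hypothetical.

```python
import math


def residual_sum_of_squares(observed, predicted):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - f) ** 2 for y, f in zip(observed, predicted))


def aic_from_rss(rss, n, p):
    """AIC of a least-squares fit, assuming Gaussian errors.

    n is the number of observations and p the number of fitted
    parameters; the extra +1 counts the estimated error variance.
    """
    return n * math.log(2 * math.pi) + n * math.log(rss / n) + n + 2 * (p + 1)


def residual_variance(rss, n, p):
    """S^2, taken here (an assumption) to be RSS / (n - p)."""
    return rss / (n - p)


def delta_aic(aic_by_model):
    """Difference between each model's AIC and the lowest AIC in the set."""
    best = min(aic_by_model.values())
    return {name: aic - best for name, aic in aic_by_model.items()}
```

Under these assumptions, a model with no free parameters (such as model 0) uses p = 0, so its S^2 reduces to RSS / n. The model with a delta AIC of 0 is the best of the candidates according to AIC.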