home ::
syllabus ::
groups ::
© 2024, Tim Menzies
(Week6 is a wellness day.)
Now we switch codes
- gate.py is history
- now we work from mylo.lua
Complete the following tasks using the data structures and algorithms in my code.
Here's an output where I loaded auto93.csv
, grabbed the first of rows
then sorted
the other rows by their distance to that first row, then printed:
- every 30th row
- the distance from that 30th row back to that first row.
Please reproduce this:
1 {8, 304, 193, 70, 1, 4732, 18.5, 10} 0.0
31 {8, 318, 150, 73, 1, 3399, 11, 20} 0.13
61 {8, 429, 198, 73, 1, 4952, 11.5, 10} 0.2
91 {6, 232, 100, 73, 1, 2945, 16, 20} 0.25
121 {6, 225, 95, 75, 1, 3264, 16, 20} 0.31
151 {8, 351, 142, 79, 1, 4054, 14.3, 20} 0.38
181 {4, 91, 70, 71, 1, 1955, 20.5, 30} 0.49
211 {4, 151, 90, 79, 1, 2556, 13.2, 30} 0.58
241 {4, 151, 90, 82, 1, 2950, 17.3, 30} 0.67
271 {4, 121, 110, 73, 2, 2660, 14, 20} 0.69
301 {4, 115, 95, 75, 2, 2694, 15, 20} 0.72
331 {4, 97, 78, 77, 2, 1940, 14.5, 30} 0.75
361 {4, 98, 76, 80, 2, 2144, 14.7, 40} 0.81
391 {4, 105, 74, 82, 2, 1980, 15.3, 40} 0.85
Using the fastmap heuristic, find two points that are .95 far each other. Eg.
From that code, using auto93.csv, print the generated tree (with best clusters at top, worse at bottom). Then print the number of Y evaluations used to generate that output:
far1: {4, 97, 88, 72, 3, 2100, 16.5, 30}
far2: {8, 305, 130, 79, 1, 3840, 15.4, 20}
distance = 0.75git a
Recursive random projections, generating clusters at each leaf. Report centroid of each leaf.
|..
|.. |..
|.. |.. |..
|.. |.. |.. |.. {3.96, 90.87, 0, 80.71, 3, 2045.71, 16.6, 35.42}
|.. |.. |.. |.. {3.88, 89.04, 0, 75.2, 3, 2087.4, 17.03, 28.8}
|.. |.. |..
|.. |.. |.. |.. {4.0, 125.92, 0, 81.68, 1, 2504.2, 16.52, 30.8}
|.. |.. |.. |.. {4.48, 129.68, 0, 78.28, 3, 2562.76, 14.96, 26.4}
|.. |..
|.. |.. |..
|.. |.. |.. |.. {4.2, 109.28, 0, 76.16, 2, 2462.92, 15.68, 26.8}
|.. |.. |.. |.. {4.0, 104.4, 0, 72.04, 2, 2309.24, 16.74, 26.0}
|.. |.. |..
|.. |.. |.. |.. {4.24, 110.24, 0, 77.76, 2, 2423.72, 17.77, 32.8}
|.. |.. |.. |.. {4.0, 122.96, 0, 78.52, 1, 2426.52, 16.04, 29.2}
|..
|.. |..
|.. |.. |..
|.. |.. |.. |.. {4.17, 128.6, 0, 73.38, 1, 2393.5, 16.91, 24.58}
|.. |.. |.. |.. {6.0, 232.0, 0, 74.96, 1, 3359.96, 17.18, 20.0}
|.. |.. |..
|.. |.. |.. |.. {6.0, 215.76, 0, 79.2, 1, 3194.44, 16.33, 22.0}
|.. |.. |.. |.. {7.6, 300.08, 0, 78.16, 1, 3727.64, 15.53, 20.4}
|.. |..
|.. |.. |..
|.. |.. |.. |.. {6.64, 255.2, 0, 71.12, 1, 3317.08, 14.82, 18.4}
|.. |.. |.. |.. {8.0, 311.96, 0, 74.44, 1, 4011.28, 13.55, 15.2}
|.. |.. |..
|.. |.. |.. |.. {8.0, 364.96, 0, 73.8, 1, 4353.44, 12.53, 12.8}
|.. |.. |.. |.. {8.0, 397.16, 0, 70.84, 1, 4286.92, 11.0, 12.4}
{5.45, 193.43, 0, 76.01, 1, 2970.42, 15.57, 23.84}
{Clndrs, Volume, HpX, Model, origin, Lbs-, Acc+, Mpg+}
evals: 30
Just prune half as you go. And print the surviving examples.
centroid of output cluster:
{3.96, 90.87, 0, 80.71, 3, 2045.71, 16.6, 35.42}
evals: 5
Here's something that works, but I can't explain why.
Pass1: run single descent optimization down to
- Cluster down to select 32 items
- Take those survivors and then cluster down to four
- Print the median and best found in that four
For example code on that, see https://github.com/timm/lo/blob/6jan24/src/mylo.lua#L734-L739.