© 2023 timm
This subject is a peek under the hood of data mining, optimization, theorem proving, and all the other tricks of automated software engineering. This is a project-based class where students use any scripting language they like to build and extend their own AI tools for software engineering. 500-level students will work in groups.

Topics covered: scripting; clustering; optimization; data mining; theorem proving; requirements engineering; discretization; explainable AI; statistics for experimental algorithms; hyperparameter optimization; deep learning.

Grading: 21% homeworks, 40% exams, 36% large end-of-term project.

Textbook: none.
- Jan 19: hw1
- Jan 26: hw2
- Feb 2: hw3
- Feb 9: hw4
- Feb 16: hw5
- Mar 6: hw6
- Mar 8 (Wed): mid-term, 4:30
- Mar 16: (break)
- Mar 21: hw7
- Mar 30: last day to submit revised homeworks
- Apr 20: project
Rahul's first rule: when it is wrong, it is so confidently wrong that you cannot tell.
- wrong on several points
- security
- hard-wired constants that should be adjustable
- does not support symbolics
- does not support streaming
- does not support hierarchy
- nothing about connecting clustering to sampling and optimization
398 {:Acc+ 15.6 :Lbs- 2970.4 :Mpg+ 23.8}
| 199
| | 99
| | | 49
| | | | 24 {:Acc+ 17.3 :Lbs- 2623.5 :Mpg+ 30.4}
| | | | 25 {:Acc+ 16.3 :Lbs- 2693.4 :Mpg+ 29.2}
| | | 50
| | | | 25 {:Acc+ 15.8 :Lbs- 2446.1 :Mpg+ 27.2}
| | | | 25 {:Acc+ 16.7 :Lbs- 2309.2 :Mpg+ 26.0}
| | 100
| | | 50
| | | | 25 {:Acc+ 16.2 :Lbs- 2362.5 :Mpg+ 32.0}
| | | | 25 {:Acc+ 16.4 :Lbs- 2184.1 :Mpg+ 34.8}
| | | 50
| | | | 25 {:Acc+ 16.2 :Lbs- 2185.8 :Mpg+ 29.6} <== best?
| | | | 25 {:Acc+ 16.3 :Lbs- 2179.4 :Mpg+ 26.4}
| 199
| | 99
| | | 49
| | | | 24 {:Acc+ 16.6 :Lbs- 2716.9 :Mpg+ 22.5}
| | | | 25 {:Acc+ 16.1 :Lbs- 3063.5 :Mpg+ 20.4}
| | | 50
| | | | 25 {:Acc+ 17.4 :Lbs- 3104.6 :Mpg+ 21.6}
| | | | 25 {:Acc+ 16.3 :Lbs- 3145.6 :Mpg+ 22.0}
| | 100
| | | 50
| | | | 25 {:Acc+ 12.4 :Lbs- 4320.5 :Mpg+ 12.4}
| | | | 25 {:Acc+ 11.3 :Lbs- 4194.2 :Mpg+ 12.8} <== worst
| | | 50
| | | | 25 {:Acc+ 13.7 :Lbs- 4143.1 :Mpg+ 18.0}
| | | | 25 {:Acc+ 14.4 :Lbs- 3830.2 :Mpg+ 16.4}
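A tree like the one above can be grown by recursive random projection: pick two distant rows, split the data by which of the two each row is nearer to, and recurse on each half. Here is a minimal sketch (all names are mine, not the course code; the distance function, pivot choice, and stop rule are simplifying assumptions, and the per-node statistics are omitted):

```python
import random

def dist(r1, r2):
    # Euclidean distance over the row's columns (a simplifying assumption)
    return sum((a - b) ** 2 for a, b in zip(r1, r2)) ** 0.5

def half(rows):
    # Pick two distant points A, B (cheap approximation: random pivot ->
    # farthest point A -> farthest-from-A point B), then split the rows
    # by which of A, B each row is nearer to.
    pivot = random.choice(rows)
    a = max(rows, key=lambda r: dist(r, pivot))
    b = max(rows, key=lambda r: dist(r, a))
    lefts, rights = [], []
    for r in rows:
        (lefts if dist(r, a) <= dist(r, b) else rights).append(r)
    return lefts, rights

def tree(rows, stop, depth=0, out=None):
    # Recursively bisect until clusters shrink below 2*stop rows,
    # recording "| "-indented node sizes like the output above.
    out = [] if out is None else out
    out.append("| " * depth + str(len(rows)))
    if len(rows) >= 2 * stop:
        for part in half(rows):
            if 0 < len(part) < len(rows):
                tree(part, stop, depth + 1, out)
    return out

random.seed(1)
data = [[random.random() for _ in range(3)] for _ in range(64)]
print("\n".join(tree(data, stop=8)))
```

Note that this explores every branch; the point of the next variant is that we need not.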
Now here's nearly the same algorithm, but now we run a greedy search over the splits. When splitting on two distant points A and B, we peek at the Y values and ignore the worse half.
398 {:Acc+ 15.6 :Lbs- 2970.4 :Mpg+ 23.8}
| 199
| | 100
| | | 50
| | | | 25 {:Acc+ 17.2 :Lbs- 2001.0 :Mpg+ 33.2}
- nothing about bias
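That greedy descent can be sketched as follows, again with my own stand-ins: here the last column is a single y value to maximize, and "worse half" means lower mean y (the course output scores clusters over multiple objectives such as Acc+, Lbs-, Mpg+, which this sketch does not do):

```python
import random

def dist(r1, r2):
    # Distance over the x columns only; here, all but the last column.
    return sum((a - b) ** 2 for a, b in zip(r1[:-1], r2[:-1])) ** 0.5

def half(rows):
    # Split on two distant points A, B:
    # random pivot -> farthest point A -> farthest-from-A point B.
    pivot = random.choice(rows)
    a = max(rows, key=lambda r: dist(r, pivot))
    b = max(rows, key=lambda r: dist(r, a))
    lefts, rights = [], []
    for r in rows:
        (lefts if dist(r, a) <= dist(r, b) else rights).append(r)
    return lefts, rights

def score(rows):
    # Stand-in objective: mean of the last column, to be maximized.
    return sum(r[-1] for r in rows) / len(rows)

def branch(rows, stop):
    # Greedy descent: at each split, keep only the better half and
    # never look at the rest, so we visit one path, not the whole tree.
    while len(rows) >= 2 * stop:
        print(len(rows))  # trace the path, like the truncated output above
        lefts, rights = half(rows)
        rows = lefts if score(lefts) >= score(rights) else rights
    return rows

random.seed(1)
data = [[random.random() for _ in range(3)] for _ in range(398)]
best = branch(data, stop=25)
print("leaf size:", len(best), "mean y:", round(score(best), 2))
```

Since the kept half always has the higher mean, the leaf's mean y can never be worse than the whole data's, and the search touches only one root-to-leaf path.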
My main problem: cognitive fixation:
"If it was up to current thinking (in the 1920s) to cure polio. . . You'd have the best iron lung in the world but not a polio vaccine."
Rahul's second rule: you need to know a lot about the code to make best use of automatically generated code.