3D object manipulation with hands via camera
A simple demo (with SHOW_SELFIE = True):
- Mayavi
- MediaPipe
- PyQt5
- VTK
- Vedo
These two codebases power the bulk of this project:
- https://github.com/marcomusy/vedo -- Vedo is a Python library for visualizing and manipulating data in 3D. The maintainer (Marco Musy) is very helpful and has included a lot of examples to get started for just about any data visualization task.
- https://google.github.io/mediapipe/ -- MediaPipe was created by Google. It provides ready-made computer vision pipelines for landmark and pose estimation (hands, face, body, etc.). One great application of this is detecting hand gestures (there are several cool projects using it to "read" sign language).
There are plenty of applications at the intersection of 3D visualization and computer vision. Here are a few that came to mind (add your own!):
- Manipulating 3D objects to understand them better (think: cell images, topographical models, asteroid data, etc.)
- Medical field: practicing surgery (would be extremely crude!)
- Games (e.g. "Ping Pong" with a virtual table)
- 3D printing (my original use case: examining files closely)
- General engineering: helping to design and view CAD models
Currently working on:
- Allowing multiple objects to be rendered
- Selection of individual objects
- Slicing with a plane
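
For the plane slicing item, a cutting plane can be described by a point on the plane and a normal vector. The sketch below is only an illustration under stated assumptions, not this project's code: it assumes three 3D hand landmarks (for example the wrist and two knuckles) are already available as `(x, y, z)` tuples, and `plane_from_points` is a hypothetical helper name.

```python
import math


def plane_from_points(p0, p1, p2):
    """Return (origin, unit_normal) for the plane through three 3D points.

    Hypothetical helper: the points stand in for three hand landmarks.
    """
    # Two edge vectors lying in the plane.
    u = tuple(b - a for a, b in zip(p0, p1))
    v = tuple(b - a for a, b in zip(p0, p2))
    # Their cross product is perpendicular to the plane.
    n = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(c * c for c in n))
    if length < 1e-12:
        raise ValueError("points are (nearly) collinear; no unique plane")
    return p0, tuple(c / length for c in n)
```

An origin and a normal are the usual inputs to a plane-cut routine, so the returned pair could be handed to whatever slicing call the renderer provides.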
Other things that must be done:
- Much refactoring + cleanup needed
  - Zoom bug (two hands): the zoom direction flips because the code cannot tell which hand was detected first. The hands should be ordered consistently (left, then right); as a workaround, I took the absolute value in the right place instead
  - Zoom bug: ensure object stays centered while zooming
  - Two hand bug: showing two hands at first does not work
  - Separate out the functions for zoom / pan / rotate / etc.
  - Add a smoothing function
  - Fix the MIN_WAITING_FRAMES
  - Track both left and right hand openness separately
- Ask user which camera to use and save the choice (see next item)
- Create a default settings/preferences file for python app
- Add keyboard interaction
- Allow toggling between selfie and no selfie mode (currently only disabling works)
- Add voice interaction
- Clean up PyQt applications
  - Create an executable file
  - Create a default settings file for Qt app
- Interact with CAD (multi-object) files
  - Allow for "pick and place" of parts
  - Use exploded view
- View cross-sections by using hand as slider and plane definer (WIP)
- Create a dependency graph for the project (at least a requirements.txt file)
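
Two of the items above, the smoothing function and the two-hand zoom direction bug, can be sketched without any camera code. This is a hedged illustration, not this project's implementation: it assumes each hand has already been reduced to one `(x, y)` point per frame, it uses an exponential moving average as one possible smoothing choice, and the names `ExponentialSmoother` and `zoom_factor` are hypothetical.

```python
import math


class ExponentialSmoother:
    """Exponential moving average over a stream of (x, y) points."""

    def __init__(self, alpha=0.5):
        if not 0 < alpha <= 1:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha  # higher alpha = less smoothing, less lag
        self.state = None   # last smoothed point, None until first update

    def update(self, point):
        if self.state is None:
            # First sample: nothing to blend with yet.
            self.state = tuple(point)
        else:
            self.state = tuple(
                self.alpha * new + (1 - self.alpha) * old
                for new, old in zip(point, self.state)
            )
        return self.state


def zoom_factor(prev_hands, curr_hands):
    """Ratio of the current hand separation to the previous one.

    Each argument is a pair of (x, y) points, one per hand. Euclidean
    distance is symmetric, so the result does not depend on which hand
    was detected first.
    """
    prev = math.dist(*prev_hands)
    curr = math.dist(*curr_hands)
    if prev == 0:
        return 1.0  # avoid division by zero; treat as "no zoom"
    return curr / prev
```

Working from the distance between the hands rather than a signed difference has the same effect as the absolute-value workaround: a value above 1 always means the hands moved apart, regardless of detection order.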