Submission: PDF documentation (Primary resource) + .py script + demo video link
Objective
Design and implement a standalone desktop computer vision tool using OpenCV and Tkinter. The tool must provide a graphical user interface (GUI) that allows users to open a local image or access the webcam, apply image processing and vision operations, **adjust parameters interactively**, and save the output. The emphasis is on correct implementation, usability, parameter control, and clear documentation.
Functional Requirements (What your tool must do)
A. Input & I/O
Open Local Image
i) File → Open (Tkinter menu), ii) Supported formats: JPG, PNG, BMP (at least)
Access Live Webcam
i) File → Access Live Webcam, ii) Display the live feed in the main window, iii) Take Snapshot button to freeze a frame and switch to static-image mode
Save Output
i) File → Save As, ii) Save the currently displayed image (processed result)
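Because OpenCV's `cv2.imwrite` chooses the output format from the file extension, a Save As handler usually has to make sure the chosen path ends in a supported extension. The helper below is a minimal sketch of that check. The names `normalize_save_path` and `SUPPORTED_EXTENSIONS` are hypothetical, and falling back to PNG is an assumption rather than a requirement of this assignment.

```python
from pathlib import Path

# Formats the assignment lists as the minimum to support (assumed set).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def normalize_save_path(path, default_ext=".png"):
    """Return a path with a supported image extension, or None if empty.

    An empty string is what Tkinter's save dialog returns when the user
    cancels, so it is passed through as None instead of being "fixed".
    """
    if not path:
        return None
    p = Path(path)
    if p.suffix.lower() in SUPPORTED_EXTENSIONS:
        return str(p)
    # Unknown or missing extension: append the default one.
    return str(p.with_name(p.name + default_ext))
```

In the tool, the result would typically be passed to `cv2.imwrite` only when it is not `None`.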
B. GUI & Interaction
Implement **at least** the following GUI features:
- Menu bar (File / Tools)
- Buttons (e.g., Apply, Snapshot)
- Trackbars / sliders for real-time parameter control
- Text box for numeric input (with validation)
- Clean exit and proper resource release (camera, windows)
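The "clean exit and proper resource release" item can be handled by keeping the camera behind one small object that is always asked to stop before the window closes. The sketch below assumes a hypothetical `CameraSession` class. The capture object is injected so that the logic can be exercised without a webcam; in the real tool the factory would probably be `lambda: cv2.VideoCapture(0)`, whose `release()` method frees the device.

```python
class CameraSession:
    """Owns at most one capture object and guarantees it is released once."""

    def __init__(self, open_capture):
        # open_capture: zero-argument callable returning an object that
        # has a release() method (e.g. a cv2.VideoCapture instance).
        self._open_capture = open_capture
        self._capture = None

    @property
    def active(self):
        return self._capture is not None

    def start(self):
        # Opening twice would leak the first handle, so start() is idempotent.
        if self._capture is None:
            self._capture = self._open_capture()

    def stop(self):
        # Safe to call repeatedly, e.g. from both Snapshot and window close.
        if self._capture is not None:
            self._capture.release()
            self._capture = None
```

With Tkinter, one option is to register a close handler via `root.protocol("WM_DELETE_WINDOW", handler)` that calls `stop()` and then `root.destroy()`.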
C. Image Processing & Vision Operations (Choose 10 total, with constraints)
You must implement at least 10 operations overall, satisfying the minimum coverage below. You may implement more for bonus credit.
1. Foundations & Color (Choose 2)
- RGB channel access and manipulation
- Grayscale conversion
- Brightness / contrast adjustment
- HSV (Hue/Saturation/Value) adjustment
- Color blindness simulation (matrix-based)
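As an illustration of one option from this group, brightness/contrast adjustment is commonly written as the linear map g = alpha * f + beta, clipped to the 8-bit range. The sketch below uses plain NumPy; the function name and the parameter names `alpha` (contrast gain) and `beta` (brightness offset) are assumptions chosen for the example. OpenCV's `cv2.convertScaleAbs` offers a similar scale-and-offset operation.

```python
import numpy as np


def adjust_brightness_contrast(image, alpha=1.0, beta=0.0):
    """Apply g = alpha * f + beta and clip the result to [0, 255].

    image: uint8 array (grayscale or colour).
    alpha: contrast gain (1.0 leaves contrast unchanged).
    beta:  brightness offset in intensity levels.
    """
    # Work in float so intermediate values can leave the uint8 range.
    scaled = image.astype(np.float32) * alpha + beta
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)
```

Two sliders, one per parameter, map naturally onto this function, with the image recomputed from the original each time a slider moves so that adjustments do not accumulate.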
2. Image Statistics (Implement all)
- Histogram computation and display
- Histogram equalization
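Both statistics operations can be understood through the cumulative distribution of intensities. The following is a minimal NumPy sketch for 8-bit grayscale images, with hypothetical function names. It is meant to show the idea, not to reproduce OpenCV's output exactly; in the tool itself `cv2.calcHist` and `cv2.equalizeHist` are the usual choices.

```python
import numpy as np


def compute_histogram(gray):
    """Count how many pixels take each of the 256 intensity levels."""
    return np.bincount(gray.ravel(), minlength=256)


def equalize_histogram(gray):
    """Remap intensities so that their cumulative distribution is ~linear."""
    if gray.size == 0:
        return gray.copy()
    cdf = compute_histogram(gray).cumsum()
    cdf_min = cdf[cdf > 0][0]      # CDF value at the darkest level present
    total = gray.size
    if total == cdf_min:           # constant image: nothing to spread out
        return gray.copy()
    # Lookup table: darkest present level -> 0, brightest -> 255.
    lut = np.rint((cdf - cdf_min) * 255.0 / (total - cdf_min))
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]
```

The 256-element array returned by `compute_histogram` can be drawn as bars on a Tkinter canvas or onto a blank image for the histogram display.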
3. Point & Local Operators (Choose 2)
- Contrast stretching
- Median filter (kernel size via text box)
- Gaussian smoothing (kernel size + σ)
- Sharpening (Laplacian or custom kernels)
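Contrast stretching, one of the options above, linearly maps a chosen input range onto the full 0 to 255 range. The sketch below picks that range from percentiles. The percentile parameters `low_pct` and `high_pct` are an assumption for illustration (a plain min/max stretch corresponds to 0 and 100), and the function name is hypothetical.

```python
import numpy as np


def stretch_contrast(gray, low_pct=2.0, high_pct=98.0):
    """Linearly stretch intensities between two percentiles to [0, 255]."""
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:
        # Flat image (or degenerate percentiles): nothing to stretch.
        return gray.copy()
    out = (gray.astype(np.float32) - lo) * 255.0 / (hi - lo)
    # Values outside [lo, hi] saturate at 0 or 255.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Using percentiles instead of the absolute minimum and maximum makes the result less sensitive to a few outlier pixels, at the cost of saturating them.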
4. Edge Detection (Choose 2)
- **Sobel** (kernel size + threshold defined as a ratio of mean)
- Canny (control t1, t2, aperture size)
- Laplacian of Gaussian (LoG) (σ, kernel size, threshold)
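For the Sobel option, "threshold defined as a ratio of mean" can be read as: mark a pixel as an edge when its gradient magnitude exceeds `ratio` times the mean gradient magnitude of the image. That reading is an assumption, as is the function name below. The sketch hard-codes the 3x3 Sobel kernels with NumPy slicing to stay dependency-light; in the tool, `cv2.Sobel` with a user-selected kernel size would be the natural replacement.

```python
import numpy as np


def sobel_edges(gray, ratio=1.5):
    """Binary edge map: gradient magnitude > ratio * mean magnitude."""
    # Replicate the border so the output has the same size as the input.
    g = np.pad(gray.astype(np.float32), 1, mode="edge")
    # Horizontal derivative: right column minus left column, weights 1-2-1.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    # Vertical derivative: bottom row minus top row, weights 1-2-1.
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    magnitude = np.hypot(gx, gy)
    threshold = ratio * magnitude.mean()
    return np.where(magnitude > threshold, 255, 0).astype(np.uint8)
```

Tying the threshold to the mean makes a single slider value behave similarly on dark and bright images, since the cutoff scales with the overall gradient strength.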
5. Segmentation (Choose 2)
- Global thresholding
- Adaptive thresholding (block size, method, etc.)
- Contour detection
- OR one advanced method:
  - Mean Shift (sp, sr, pyramid levels)
  - Superpixels (SLIC: n_segments, compactness)
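To make the adaptive thresholding parameters concrete, the sketch below implements the mean variant: each pixel is compared with the mean of its `block_size` x `block_size` neighbourhood minus a constant `c`. It is a NumPy illustration using an integral image, with hypothetical names and defaults. In the tool, `cv2.adaptiveThreshold` provides this operation directly, and its border handling may differ slightly from this sketch.

```python
import numpy as np


def adaptive_mean_threshold(gray, block_size=11, c=2):
    """Return 255 where gray > local_mean - c, else 0."""
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd integer >= 3")
    r = block_size // 2
    # Replicate borders so every pixel has a full neighbourhood.
    padded = np.pad(gray.astype(np.float64), r, mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    b = block_size
    # Sum over each b x b window from four integral-image corners.
    window_sum = ii[b:, b:] - ii[:-b, b:] - ii[b:, :-b] + ii[:-b, :-b]
    local_mean = window_sum / float(b * b)
    return np.where(gray > local_mean - c, 255, 0).astype(np.uint8)
```

The `ValueError` is a natural place for the GUI layer to catch the problem and fall back to a default block size.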
Implementation Rules:
- **i) Python only**
- **ii) OpenCV + Tkinter only (no PyQt, no web frameworks)**
- **iii) Must run locally (not Colab); your tool will be tested by the instructor by calling your script from the terminal**
- AI tools may be used, but **you must understand and be able to explain your code in the technical interview**
- Code must be well-structured (functions/classes, not one giant script)
Your tool should handle edge cases (such as no image loaded or invalid parameters). These must not cause a crash; instead, the user should be informed (for example: "Please load an image before running the function"), and, when necessary, parameter values should be reset to defaults when they fall outside acceptable ranges (e.g. kernel size = -1).
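The edge-case rules above can be kept out of the GUI code by routing every text-box value and every operation call through small helpers that return a message instead of raising. The sketch below is one possible shape. The function names, the default of 3, and the 3 to 31 range are assumptions; the odd-size rule reflects the fact that OpenCV filters such as `cv2.medianBlur` expect an odd kernel size.

```python
def parse_kernel_size(text, default=3, minimum=3, maximum=31):
    """Return (value, message); message is None if the input was accepted."""
    try:
        value = int(str(text).strip())
    except ValueError:
        return default, f"Kernel size must be an integer; using default {default}."
    if not minimum <= value <= maximum:
        return default, (f"Kernel size must be between {minimum} and "
                         f"{maximum}; using default {default}.")
    if value % 2 == 0:
        # Nudge even sizes to the nearest odd size that stays in range.
        value = value + 1 if value < maximum else value - 1
        return value, f"Kernel size must be odd; using {value}."
    return value, None


def apply_safely(image, operation):
    """Run operation(image), or return a user-facing message if no image."""
    if image is None:
        return None, "Please load an image before running the function."
    return operation(image), None
```

In the GUI, a non-`None` message could be shown with `tkinter.messagebox.showwarning`, and the corrected value written back into the text box so the user sees what was actually applied.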
Deliverables:
IMPORTANT: In your submission, upload the PDF as the primary resource and a zip file containing both the PDF and the script as a secondary resource.
1. **PDF Documentation (3-6 pages): Primary resource to upload**
Your PDF must include:
- **Overview of the Tool**
  - A short description of the main features of the tool
- **List of Implemented Functionalities**
  - One subsection per operation
- **Parameters Table**
  - For each operation:
    - Parameter name
    - Type (slider/textbox/menu)
    - Valid range
    - Effect on output
- **GUI Screenshots**
  - Annotated where appropriate
2. **Python Script (.py): To be packed together with the PDF file and uploaded as a secondary resource**
Your script must:
- Launch the GUI
- Support image and webcam modes
- Provide interactive controls for selected operations
- Save outputs correctly
- Be readable and commented
**File name example:** cv_gui_assignment1_.py
3. Video Demonstration (2-4 minutes): To include on the first page of the PDF
Include a **link** (Google Drive) at the beginning of your PDF document to a video showing the following user operations:
- Opening an image
- Using **at least 4 different operations**
- Adjusting parameters live
- Accessing webcam and taking a snapshot
- Saving the output
Bonus (Optional)
- Side-by-side original vs processed view
- Preset buttons (e.g., Low Noise, Strong Edges)
- Keyboard shortcuts
Academic Integrity
This is an individual assignment. You may discuss ideas, but code and documentation must be your own. Any AI-assisted code must be understood and defensible during evaluation.
If this description overwhelms you, consider working through the following list first (a sample implementation for task11 is available).
Requirements: In Depth with ALL Details and Steps
