@rutvij2292
Last active September 26, 2025 11:57
AI-powered Laravel controller to extract structured attendance data (JSON) from images

Extract Attendance Data from Image (Laravel + Groq API)

This gist contains a Laravel controller that processes an uploaded attendance sheet image (JPEG/PNG/etc.) and converts it into a clean, normalized JSON format using an AI model.

✨ Features

  • Validates the uploaded attendance image (JPEG, PNG, JPG, GIF, or SVG; up to 2048 KB).
  • Converts the image to Base64 and sends it to the Groq API as a data URL.
  • Applies a strict parsing & normalization prompt to ensure consistent output.
  • Handles:
    • Date normalization (DD/MM/YYYY format).
    • Time formatting & zero-padding (HH:mm).
    • Range values (e.g., "9 to 18" → in: 09:00, out: 18:00).
    • AM/PM → 24-hour conversion.
    • Missing or empty values (set to null).
    • Out-time adjustment (e.g., 9:32 → 21:32).
  • Returns JSON only, grouped by device_id.
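
The time rules above are enforced only through the prompt, so the result depends on the model following them. As a rough illustration, the same rules can be written as a deterministic PHP function, which could also be used to re-check the values the model returns. This is a minimal sketch, not part of the gist: `normalizeTime` and its `$isOutTime` flag are hypothetical names, unrecognized input is simply treated as missing, and range splitting ("X to Y") is left out for brevity.

```php
<?php

/**
 * Hypothetical helper (not in the gist): normalize one time cell to "HH:mm" or null.
 * When $isOutTime is true, hours 1-11 without AM/PM are shifted by 12 hours.
 */
function normalizeTime(?string $raw, bool $isOutTime = false): ?string
{
    $value = trim((string) $raw);
    if ($value === '' || $value === '-') {
        return null; // missing or empty cell
    }

    // Cleanup: "5.12" -> "5:12"; drop "hrs"/"h" suffixes but keep AM/PM.
    $value = str_replace('.', ':', $value);
    $value = trim((string) preg_replace('/\s*(hrs|h)\b/i', '', $value));

    // Accept "H", "HH", "H:mm", "HH:mm", each with an optional AM/PM.
    if (!preg_match('/^(\d{1,2})(?::(\d{1,2}))?\s*(AM|PM)?$/i', $value, $m)) {
        return null; // unrecognized format; treated as missing in this sketch
    }

    $hour = (int) $m[1];
    $minute = (int) ($m[2] ?? 0);
    $meridiem = strtoupper($m[3] ?? '');

    if ($hour > 23 || $minute > 59) {
        return null;
    }

    if ($meridiem === 'PM' && $hour < 12) {
        $hour += 12;               // "10:15 PM" -> 22:15
    } elseif ($meridiem === 'AM' && $hour === 12) {
        $hour = 0;                 // "12:15 AM" -> 00:15
    } elseif ($meridiem === '' && $isOutTime && $hour >= 1 && $hour <= 11) {
        $hour += 12;               // out-time rule: "9:32" -> 21:32
    }

    return sprintf('%02d:%02d', $hour, $minute);
}
```

For example, `normalizeTime('9:32', true)` applies the out-time rule, while `normalizeTime('9:32')` leaves the hour unchanged, mirroring the in-time rule.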

📦 Example Output

{
  "1234": [
    { "date": "01/09/2025", "in": "08:00", "out": "21:15" },
    { "date": "02/09/2025", "in": "07:55", "out": null }
  ]
}
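
Since the model's reply is passed back to the client after decoding, it may be worth checking that the decoded value really has the shape shown above. The following is a minimal sketch under stated assumptions: `isValidAttendancePayload` is a hypothetical helper that is not part of the gist, and it assumes PHP 8.1 or later (for `mixed` and `array_is_list`).

```php
<?php

/**
 * Hypothetical helper (not in the gist): check that decoded model output matches
 * the documented shape { "<device_id>": [ { "date", "in", "out" }, ... ] }.
 */
function isValidAttendancePayload(mixed $payload): bool
{
    if (!is_array($payload)) {
        return false;
    }

    // A time is either null or a zero-padded 24-hour "HH:mm" string.
    $isTime = fn ($v): bool => $v === null
        || (is_string($v) && preg_match('/^([01]\d|2[0-3]):[0-5]\d$/', $v) === 1);

    foreach ($payload as $entries) {
        if (!is_array($entries) || !array_is_list($entries)) {
            return false; // each device_id must map to a JSON array
        }

        foreach ($entries as $entry) {
            if (!is_array($entry)
                || !array_key_exists('in', $entry)
                || !array_key_exists('out', $entry)) {
                return false;
            }

            $date = $entry['date'] ?? null;
            if (!is_string($date) || preg_match('/^\d{2}\/\d{2}\/\d{4}$/', $date) !== 1) {
                return false; // dates must be DD/MM/YYYY
            }

            if (!$isTime($entry['in']) || !$isTime($entry['out'])) {
                return false;
            }
        }
    }

    return true;
}
```

The check is purely structural: it confirms the format of each date and time but does not verify that a date such as 31/02/2025 exists.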

<?php

namespace App\Http\Controllers\Attendance;

use App\Http\Controllers\Controller;
use Illuminate\Http\Request;
use Illuminate\Support\Facades\Http;
use Illuminate\Support\Facades\Log;

class ExtractAttendanceFromImageController extends Controller
{
    /**
     * Send an uploaded attendance sheet image to the Groq API and return the extracted JSON.
     */
    public function __invoke(Request $request)
    {
        // Accept common image types up to 2048 KB.
        $request->validate([
            'attendance_image' => 'required|image|mimes:jpeg,png,jpg,gif,svg|max:2048',
        ]);

        try {
            $image = $request->file('attendance_image');

            // The image is embedded in the request as a Base64 data URL.
            $base64Image = base64_encode($image->get());
            $imageMimeType = $image->getMimeType();

            $systemPrompt = <<<PROMPT
You are given an image containing tabular data with columns: device_id, entry_date, in_time, out_time.
Extract the data and return ONLY a valid JSON object (no explanations, no extra text).
OUTPUT SCHEMA
{
"<device_id>": [
{ "date": "DD/MM/YYYY", "in": "HH:mm" | null, "out": "HH:mm" | null },
...
],
...
}
PARSING & NORMALIZATION RULES (APPLY IN THIS ORDER)
1) Trim whitespace from all cells. Treat empty or blank cells as "-".
2) Normalize the entry_date to "DD/MM/YYYY".
3) RANGE HANDLING: If any time cell contains a range like "X to Y" (case-insensitive):
- Split on "to". Let X = first part, Y = second part (trimmed).
- Use X for "in" ONLY if the in_time column is "-" or empty; otherwise ignore X.
- Use Y for "out" (unless Y is "-" or empty).
4) MISSING TIMES: After ranges are handled, if in_time == "-" then "in": null. If out_time == "-" then "out": null.
5) CLEANUP BEFORE PARSING:
- Replace '.' with ':' in times (e.g., "5.12" -> "5:12").
- Remove suffixes like "hrs", "h" (but keep AM/PM if present).
- Accept forms: "H", "HH", "H:mm", "HH:mm", "H.mm", with or without AM/PM.
6) TIME PARSING & ZERO-PADDING:
- Parse hour and minute; if minutes missing add ":00" (e.g., "9" -> "09:00").
- Zero-pad to "HH:mm" (e.g., "8:5" -> "08:05").
- If AM/PM present → convert to 24-hour (e.g., "10:15 PM" -> "22:15").
- If parsed hour >= 13 → keep as-is (already 24-hour).
7) OUT-TIME RULE (CRITICAL):
- If out_time has no AM/PM and parsed hour is 1–11 (including cases like 9, 9:15, 09:32, 08:45, 10:00, 11:59) → ADD 12 hours.
(e.g., "9:32" -> "21:32", "9:15" -> "21:15", "08:45" -> "20:45", "10:00" -> "22:00").
- If out_time is "12:xx" with no AM/PM → keep as "12:xx".
- If parsed hour >= 13 → keep as-is.
- This also applies to the Y part of "X to Y" ranges.
8) IN-TIME RULE:
- If in_time has no AM/PM → assume it is already 24-hour, do NOT add 12.
- Exception: if in_time was filled from a range’s X because the in_time column was missing, parse X as-is.
9) FINAL OUTPUT:
- All times must be strings in "HH:mm" or JSON null.
- Dates must be "DD/MM/YYYY".
- Group entries by device_id.
- Return a single valid JSON object only (keys in double quotes, no trailing commas, no extra text).
EXAMPLES (GUIDANCE ONLY — DO NOT INCLUDE IN OUTPUT)
- in="08:00", out="9:15" → "in":"08:00", "out":"21:15"
- in="08:00", out="09:32" → "in":"08:00", "out":"21:32"
- in="08:00", out="08:45" → "in":"08:00", "out":"20:45"
- in="08:00", out="10:00" → "in":"08:00", "out":"22:00"
- in="08:00", out="11:59" → "in":"08:00", "out":"23:59"
- in="09:05 AM", out="6:30 PM" → "in":"09:05", "out":"18:30"
- in="07:55", out="-" → "in":"07:55", "out":null
IMPORTANT: Return ONLY the JSON object, nothing else.
PROMPT;

            // Provider settings come from the "llm" config.
            $apiKey = config('llm.providers.groq.api_key');
            $apiUrl = config('llm.providers.groq.api_url');
            $model = config('llm.providers.groq.model');

            if (!$apiKey || !$apiUrl || !$model) {
                return response()->json(['message' => 'Groq API credentials are not configured properly.'], 500);
            }

            $response = Http::withHeaders([
                'Authorization' => 'Bearer ' . $apiKey,
                'Content-Type' => 'application/json',
            ])->post($apiUrl, [
                'model' => $model,
                'messages' => [
                    [
                        'role' => 'system',
                        'content' => $systemPrompt,
                    ],
                    [
                        'role' => 'user',
                        'content' => [
                            [
                                'type' => 'text',
                                'text' => 'Extract the attendance data from this image.',
                            ],
                            [
                                'type' => 'image_url',
                                'image_url' => [
                                    'url' => "data:{$imageMimeType};base64,{$base64Image}",
                                ],
                            ],
                        ],
                    ],
                ],
                'temperature' => 0,
                'max_tokens' => 4096,
                'top_p' => 1,
                'stream' => false,
                'response_format' => ['type' => 'json_object'],
            ]);
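
            if ($response->failed()) {
                Log::error('Groq API request failed', [
                    'status' => $response->status(),
                    'body' => $response->body(),
                ]);

                return response()->json(['message' => 'Failed to process image with AI model.'], $response->status());
            }

            $completion = $response->json('choices.0.message.content');
            $data = is_string($completion) ? json_decode($completion, true) : null;

            // Guard against a missing or malformed completion instead of returning a bare "null".
            if (!is_array($data)) {
                Log::error('Groq API returned an unparsable completion', ['completion' => $completion]);

                return response()->json(['message' => 'AI model returned an invalid response.'], 502);
            }

            return response()->json($data);
        } catch (\Exception $e) {
            Log::error('Error processing attendance image: ' . $e->getMessage());

            return response()->json(['message' => 'An internal server error occurred.'], 500);
        }
    }
}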
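
The controller reads its settings from `config('llm.providers.groq.*')`, but the gist does not include that config file. A minimal sketch of what it might look like follows. The file name `config/llm.php` follows Laravel's convention for the `llm` key; the environment variable names (`GROQ_API_KEY`, `GROQ_API_URL`, `GROQ_MODEL`) are assumptions, and the default URL is assumed to be Groq's OpenAI-compatible chat completions endpoint.

```php
<?php

// config/llm.php (hypothetical; key names mirror the config() calls in the controller)
return [
    'providers' => [
        'groq' => [
            // Assumed env variable names; adjust to match your .env file.
            'api_key' => env('GROQ_API_KEY'),
            'api_url' => env('GROQ_API_URL', 'https://api.groq.com/openai/v1/chat/completions'),
            // No default on purpose: the model must accept image input.
            'model' => env('GROQ_MODEL'),
        ],
    ],
];
```

Because the controller is invokable, it can be registered directly as a route action, for example `Route::post('/attendance/extract', ExtractAttendanceFromImageController::class);` (the path is illustrative).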