Practical AI integration. Not hype.
We infuse AI across our products and our clients' systems where it actually changes the outcome. We also turn it down when a deterministic rule works better. Here's how we decide and what we build.
We use what we recommend
The ils-review-bot reviews our own pull requests every day. It catches what humans miss, escalates what it can't decide, and runs on Groq's free tier, so it costs almost nothing to operate.
ils-review-bot
NestJS app integrated with GitHub webhooks (Octokit) and Groq LLMs. Auto-reviews PRs when added as reviewer, posts formatted markdown comments, and nudges humans when AI confidence is low.
$ npm run start:prod
[Nest] LOG Starting Nest application...
[Nest] LOG GroqService initialized (llama-3.3-70b-versatile)
[Nest] LOG GitHubService initialized
[Nest] LOG ✓ Listening on :3000
[webhook] PR #142 review_requested → bolade-akinniyi
[review] Fetching diff for ils/nexus-crm#142
[review] Sending to Groq (1,847 tokens)
[review] Confidence: 0.91 ✓
[github] Posting review comment...
[done] Review posted. Suggestions: 3.
AI for the work that matters
Four categories where we've consistently seen AI move the needle on cost, speed, or capability.
Document processing automation
Invoice extraction, contract analysis, KYC document validation. We integrate LLMs into the pipelines where manual review used to be the bottleneck.
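A core pattern in these pipelines is defensive validation of model output: anything the LLM returns that doesn't match the expected shape falls back to manual review rather than entering the system. A minimal sketch, with an illustrative invoice schema (field names are examples, not a production contract):

```typescript
// Illustrative shape for extracted invoice fields (not a production schema).
interface InvoiceFields {
  vendor: string;
  total: number;
  dueDate: string;
}

// LLMs return text; parse defensively and reject anything malformed so a
// bad extraction is routed to a human instead of silently entering the system.
function parseInvoice(llmOutput: string): InvoiceFields | null {
  try {
    const data = JSON.parse(llmOutput);
    if (
      typeof data.vendor === "string" &&
      typeof data.total === "number" &&
      typeof data.dueDate === "string"
    ) {
      return { vendor: data.vendor, total: data.total, dueDate: data.dueDate };
    }
  } catch {
    // Fall through: unparseable output is treated the same as invalid fields.
  }
  return null; // null signals "route to manual review"
}
```

The `null` return is the important part: the deterministic validator, not the model, decides what reaches downstream systems.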
Intelligent search & retrieval
Semantic search using pgvector and embeddings. Natural-language queries over enterprise data — "find customers at churn risk who paid late twice."
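In production this ranking runs inside Postgres via pgvector's cosine-distance operator (`embedding <=> query`); the underlying math is just cosine similarity over embedding vectors. A toy in-process version, with made-up three-dimensional embeddings standing in for real model output:

```typescript
// Cosine similarity between two embedding vectors (assumed equal length).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank documents by similarity to the query embedding, most similar first.
// pgvector does the same ordering with: ORDER BY embedding <=> $1 LIMIT n
function rank(
  query: number[],
  docs: { id: string; embedding: number[] }[],
): string[] {
  return [...docs]
    .sort(
      (x, y) =>
        cosineSimilarity(query, y.embedding) -
        cosineSimilarity(query, x.embedding),
    )
    .map((d) => d.id);
}
```

The natural-language part is just an embedding call on the query string; once query and documents live in the same vector space, retrieval is this sort.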
Agentic workflows
LLM-driven automation that completes multi-step tasks: triage tickets, generate response drafts, schedule follow-ups, route exceptions.
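The shape of these workflows is a pipeline of small steps where the LLM handles the judgment calls and deterministic code handles the routing. A sketch with deterministic stubs in place of the LLM calls (ticket categories and regexes are illustrative):

```typescript
type Ticket = { id: number; body: string };

// In the real workflow an LLM classifies the ticket; a regex stub stands in here.
function triage(t: Ticket): "billing" | "technical" | "other" {
  if (/invoice|refund|charge/i.test(t.body)) return "billing";
  if (/error|crash|bug/i.test(t.body)) return "technical";
  return "other"; // exceptions are routed to a human queue
}

// An LLM drafts this in practice; a human approves before anything is sent.
function draftReply(category: string): string {
  return `Thanks for reaching out. Routing your ${category} request now.`;
}

// One pass of the pipeline: classify, draft, and flag exceptions for humans.
function handle(t: Ticket) {
  const category = triage(t);
  return {
    category,
    draft: draftReply(category),
    needsHuman: category === "other",
  };
}
```

The design point: each step's output is structured, so the next step (and the human reviewer) always knows what it's looking at.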
Product features
Smart reply suggestions, sentiment analysis, lead scoring, summarization. Built into Nexus CRM and offered as patterns for client products.
When AI actually adds value
Four questions we ask before recommending AI for a problem. Pass all four and we build. Fail any and we propose something simpler.
Is there a clear, repeatable judgment task?
AI fits when the task is judgment-heavy and rules-based logic falls short.
If a deterministic rule works, use the rule. Don't pay LLM costs for things a regex solves.
Is there enough domain data to validate output?
Good ground-truth data lets us measure AI accuracy and improve it.
Without measurement, AI is a black box. Start with data, not models.
Are humans in the loop for high-stakes decisions?
AI suggests, humans decide. Confidence scores trigger human review when low.
Full automation on high-stakes decisions is where AI gets companies in trouble.
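The gate itself is a few lines of deterministic code. A minimal sketch, where the threshold is an assumed example value rather than a number from any production config:

```typescript
const REVIEW_THRESHOLD = 0.7; // assumed example cutoff, tuned per workload

type Decision =
  | { action: "auto"; confidence: number }
  | { action: "human-review"; confidence: number };

// AI suggests; below the threshold, a human decides.
function gate(confidence: number): Decision {
  return confidence >= REVIEW_THRESHOLD
    ? { action: "auto", confidence }
    : { action: "human-review", confidence };
}
```

Everything high-stakes routes through a gate like this, so raising the bar for automation is a one-line config change, not a redesign.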
Will it pay for itself within a reasonable time?
Time saved or value created should exceed inference costs by a clear margin.
AI for AI's sake is expensive marketing. We model the cost-benefit before we build.
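The model doesn't need to be sophisticated to be useful. A back-of-envelope version, where every number is a placeholder you'd replace with your own volumes and rates:

```typescript
// All fields are inputs you estimate for your own workload; nothing here
// is a real price or a real client's numbers.
interface CostModel {
  tasksPerMonth: number;
  tokensPerTask: number;
  costPerMillionTokens: number; // USD
  minutesSavedPerTask: number;
  hourlyRate: number; // USD, loaded cost of the human doing it today
}

// Positive return means time saved outweighs inference spend.
function monthlyNet(m: CostModel): number {
  const inference =
    (m.tasksPerMonth * m.tokensPerTask * m.costPerMillionTokens) / 1_000_000;
  const saved = ((m.tasksPerMonth * m.minutesSavedPerTask) / 60) * m.hourlyRate;
  return saved - inference;
}
```

If the margin isn't comfortably positive under pessimistic inputs, we propose the simpler tool instead.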
Tools we use
We pick AI providers based on cost, latency, and fit — not brand loyalty. Most of our internal tooling runs on free or low-cost tiers.
Groq
Default for internal tools and high-throughput workloads. Fast inference, a generous free tier, and llama-3.3-70b-versatile as our default model.
OpenAI / Anthropic
When the task needs frontier reasoning quality. Used selectively where performance justifies the cost.
Open-source models
Self-hosted Llama, Mistral, or specialized models when privacy, cost, or compliance requires on-prem inference.
Have an AI use case in mind?
Tell us the problem. We'll tell you honestly whether AI is the right tool — and if it is, what we'd build.
Discuss AI integration