
Technology & Software Development News


Recommended posts

Legal AI: Judge Fines Attorneys for AI-Generated Motion
MyPillow CEO's Lawyers in Trouble for Filing AI-Generated Legal Brief with Fake Citations Eric Coomer, formerly director of product strategy and security at Dominion Voting Systems, has brought a civil case in the U.S. District Court for the District of Colorado against Mike Lindell, Lindell’s media site FrankSpeech, and his retail company MyPillow. The complaint says the defendants claimed nationwide fraud in the 2020 election but provided no documents, data, or expert analysis to substantiate the charge. It also says Lindell called Coomer a traitor even though no evidence of criminal conduct was produced. According to the filing, FrankSpeech posted interviews and articles that repeated the allegations alongside advertisements for MyPillow products. Coomer, who works in election system security, alleges that the statements have harmed his professional reputation. The Motion in Limine Dispute Under the trial preparation schedule, each side had to file any motions in limine by a set deadline. Coomer’s lawyers filed "Plaintiff’s Motion in Limine", listing the specific topics — traffic accident, sexual history, substance use, religion, and politics — that they wanted excluded from the jury. Once that motion was filed, it became part of the court record, so the judge and opposing lawyers could read it. Two weeks later, Mike Lindell and the other defendants filed "Defendants’ Brief in Opposition", arguing that the same information should be admitted. The AI-Generated Brief and Citation Errors A lawyer for MyPillow and Mike Lindell relied on an autonomous large language model to prepare the opposition. The draft was submitted to the court without the customary manual check of its citations. Judge Nina Wang’s review identified almost thirty citation problems, which fell into several groups. One set involved quotations that did not appear in the opinions cited. A second group assigned legal rules to authorities that never discussed those principles. A third group mislabeled non-controlling decisions as precedent. Additional errors placed opinions in the wrong judicial districts, and several references pointed to cases that do not exist in any reporter or database. Attorney Explanations Lead counsel Christopher Kachouroff told the court that he used a large language model text generator to draft the opposition brief, noting that the tool produces prose and citations without verifying their accuracy. When questioned about the brief’s false or imprecise citations, he gave no specific reason and referred to the filing as a "draft pleading", even though it had been submitted as final. He admitted he did not compare the citations with the underlying opinions, thereby omitting the standard citation-checking step. Kachouroff said he first created an outline and partial draft himself before employing the AI system to expand the text, but the judge has already expressed doubt about that. When the judge pointed out a quotation that did not match the cited opinion, he called it an accidental paraphrase, denied any intent to mislead, and said he had assigned the citation-checking task to his co-counsel, Jennifer DeMaster. Judge Wang's Response On June 8, it was reported that Judge Nina Y. Wang fined attorneys Christopher Kachouroff and Jennifer DeMaster $3,000 each for submitting an opposition brief that included unverified, AI-generated citations in a defamation case against Mike Lindell. 
The court said the lawyers violated Federal Rule 11 because the brief contained invented cases, misquotations, and legal arguments not grounded in current law. Lawyers Warned Against Presenting AI-Generated Material Containing False Information in Legal Arguments The Problem with AI-Generated Legal Briefs Courts require lawyers to confirm every citation, meaning attorneys must independently verify that each referenced case, statute, regulation, or quotation actually exists, is cited with the correct volume and page numbers, and still represents good law. When lawyers file briefs drafted by large language model software without performing that verification step, they expose themselves to the possibility that the model has fabricated ("hallucinated") authorities or misquoted real ones, because the software predicts text rather than consulting reliable legal databases. Citing nonexistent or inaccurately described sources can mislead judges — who rely on briefs for accurate statements of the law — and confuse or unfairly disadvantage opposing counsel, potentially leading to sanctions or a loss of credibility for the lawyer and the client. Understanding Language Models and Their Limitations Large language models are computer programs that predict text by running probability calculations on billions of sentences they absorbed during training, selecting the word that is most likely to follow the user’s prompt. Because they merely apply statistical rules and never form intentions or understand meaning, they are not autonomous or self-aware artificial intelligence. When a question concerns material that was scarce or absent in the training data, the model will often invent plausible-sounding details — such as article titles, authors, or page numbers — to fill the gap, a phenomenon sometimes called "hallucination". The same statistical machinery can accelerate routine writing chores: it can draft an email, condense a report into a brief summary, or suggest alternative wording within seconds, yet it cannot independently diagnose a technical fault, devise a legal strategy, or build an investment plan.  The human user therefore sets the question, evaluates whether the answer is sensible, and decides what, if anything, to trust. Because the software manipulates patterns of words without grasping the underlying facts or their consequences, every response is merely a provisional draft that must be checked against reliable sources — a safeguard that is vital wherever errors could expose someone to legal liability, clinical harm, or financial loss. Ethical and Legal Implications Ethics rules require honesty toward the tribunal and reasonable diligence. Therefore, filing invented authority may constitute fraud if the lawyer knew the material was false, or negligence if the lawyer failed to exercise the care that the profession demands. The classification depends on what the lawyer knew or reasonably should have known at the time of filing. Providing a court with false AI-generated material can be treated as the functional equivalent of perjury because it supplies information that the lawyer knows is untrue. On this view, criminal penalties — including incarceration — should supplement traditional civil or professional sanctions. Professional Duties and Verification Requirements Lawyers must subject every machine-generated sentence to the same rigorous scrutiny they apply to a junior associate's draft. 
That entails verifying each citation, cross-checking factual assertions against the record, and ensuring compliance with local procedural rules and professional conduct standards — regardless of whether the AI produces a full brief, an outline, or a single paragraph. If a lawyer files work that has not been verified, that omission amounts to professional negligence. Verification requires looking up authorities, cross-checking numbers, and confirming quoted language. Skipping these steps breaches the duty of competence and can justify dismissal from the firm or formal discipline by the bar, which may include loss of the license to practice. The use of AI does not lessen a lawyer's duties. Every fact and citation must still be independently verified, and the combined penalties apply to any lapse. Legal AI Market Overview Market Growth and Adoption In 2025, the legal technology market continues to expand. Global revenue for legal AI software is expected to reach between USD 2.3 billion and USD 2.8 billion by year-end, up from about USD 1.9 billion in 2024. Market researchers forecast annual growth of 15 to 20 percent over the next five years, reflecting consistent double-digit adoption. Growth is being driven by the volume and complexity of legal data, pressure on legal departments to control costs, and executive support for automation. Surveys from Deloitte show that more than two-thirds of organizations plan to increase spending on generative AI in 2025, and internal legal teams identify AI tools as a practical route to efficiency. Generative AI is now a normal part of legal work. About 85 percent of lawyers use these tools every week. They help draft contracts, summarize cases and client files, and explain legal issues in simple terms. A Thomson Reuters survey says 77 percent of legal professionals think AI will reshape their jobs within five years, and half of firm leaders say rolling out AI is their top priority. Data show lawyers save roughly four hours a week thanks to these tools. Leading Legal AI Companies Harvey in the United States supplies a chat-based assistant trained on each firm’s data. By April 2025, it served lawyers at eight of the ten highest-grossing U.S. firms, with annual recurring revenue above USD 70 million. Luminance in the United Kingdom offers contract analytics software and entered 2025 with more than 700 clients in over 70 countries. Legora in Sweden delivers research and drafting tools and has 250 law firm customers across 20 markets. Eudia in the United States builds private AI agents for in-house departments. Supio and Eve focus on plaintiff-side automation. Paxton serves small and mid-sized European practices. Theo AI predicts litigation outcomes for lawyers and funders. Marveri accelerates due-diligence document review. Venture Investment Total so far in 2025: Legal tech fundraising has already topped $1 billion. Seed rounds: Marveri $3.5 million and Theo AI $6.4 million. Series A: Eudia $105 million (led by General Catalyst) and Eve $47 million (led by Andreessen Horowitz). Series B: Legora $80 million and Supio $60 million. Later stages: Luminance $75 million in February 2025. Harvey: back-to-back $300 million Series D and Series E rounds, now valued at about $5 billion (June 2025). Key Applications AI contract review Tools like Luminance read large batches of contracts, point out clauses that do not match the usual wording, and extract key data for contract management software.
In the last two years, Luminance has gained five times more customers and grown its recurring revenue sixfold. The newest versions can propose edits or negotiate straightforward agreements on their own by checking each term against the firm’s standard playbook. Legal research Westlaw and Lexis+ AI now let lawyers type questions in everyday language and receive instant AI-generated summaries. With these tools, lawyers can gather and verify case law in minutes, and citations are checked automatically. Compliance Monitoring AI services track regulatory changes, map requirements to internal policies, and identify gaps for remedial action. Demand is strongest among corporate legal departments that need to manage privacy, trade, and sector-specific rules without expanding headcount. Litigation Support Litigation teams increasingly rely on predictive analytics. Platforms analyze previous judgments, docket data, and judge-specific patterns to estimate the probability of success and inform settlement strategy. AI-enabled e-discovery systems classify and prioritize documents, while generative models produce summaries of testimony and correspondence. Bottom Line Why do AI-generated "hallucination" briefs keep showing up — even though the legal-tech market is crowded with AI startups? Lawyers often rely on free, general-purpose chatbots instead of paid legal tools. Under deadline pressure, some practitioners skip manual citation checks, and because many start-ups focus on contracts, e-discovery, or practice management rather than brief validation, lawyers still must run a separate citation checker. Cost is a barrier: platforms like Harvey, Lexis+ AI, and Luminance are priced for enterprises, so solo and small-firm practitioners often default to no-cost alternatives. Dedicated legal AI assistants require onboarding steps and Word integrations, while a public chatbot is available in one browser tab.
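One practical mitigation mentioned above is running a separate citation check before filing. The sketch below is a minimal, hypothetical illustration in Python: it extracts reporter-style citations with a simple regular expression and flags any that are missing from a firm-maintained set of authorities a human has actually verified. The citation pattern and the `verified` set are assumptions made for illustration; this is not any vendor's product and does not replace manual review.

```python
import re

# Minimal sketch of a pre-filing citation check (illustrative only).
# The pattern covers a few common reporter abbreviations; real citation
# formats are far more varied, so treat this as a starting point.
CITATION_RE = re.compile(
    r"\b\d{1,4}\s+(?:U\.S\.|S\. Ct\.|F\.2d|F\.3d|F\.4th|F\. Supp\. 2d|F\. Supp\.)\s+\d{1,4}\b"
)

def find_unverified_citations(brief_text: str, verified: set[str]) -> list[str]:
    """Return citations found in the brief that are not in the verified set."""
    return [c for c in CITATION_RE.findall(brief_text) if c not in verified]

if __name__ == "__main__":
    brief = "Plaintiff relies on 410 U.S. 113 and 999 F.3d 1234 for this proposition."
    verified = {"410 U.S. 113"}  # citations a human has looked up and confirmed
    print(find_unverified_citations(brief, verified))  # ['999 F.3d 1234']
```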
Dmitry Baraishuk • 7 min read
AI in Fintech and Banking
AI's Projected Impact on Banking by 2030 Banking industry observers are analyzing a series of recent publications that quantify and describe the influence of artificial intelligence systems on financial services work.  One study prepared by ThoughtLinks projects that banking as a whole could see roughly 40 percent of current activity redefined by 2030. To reach this estimate, the ThoughtLinks team mapped close to 5,000 individual banking processes and assessed the susceptibility of each process to automation, resequencing, elimination, or redesign.  The research indicates that tasks performed by staff in technology, engineering, and infrastructure functions may be 55 percent redefined by 2030, whereas work performed within commercial banking franchises may be altered by about 49 percent.  Wealth management roles show a projected 42 percent redefinition rate, and investment banking roles about 33 percent. Major Bank AI Deployments Already Underway Large firms have already begun significant deployments.  JPMorgan has rolled out an internal large language model suite to approximately 200,000 employees.  Goldman Sachs has introduced an internal assistant branded as GS AI Assistant.  Citigroup has appointed a group-wide leadership team to guide artificial intelligence strategy for nearly 250,000 employees.  ThoughtLinks stresses that its percentages describe the proportion of work activities that will change, not the headcount that will disappear. In its definition, "redefined" means that the process concerned will incorporate an AI component for automation or redesign.  Sumeet Chabria, former technology and operations chief operating officer at Bank of America and now leading ThoughtLinks, argues that decomposing roles into task-level elements is required before reskilling can proceed efficiently. Commercial Banking Transformation Within commercial banking, selected capabilities are already operational.  First-generation advisory copilots summarize client files, draft memoranda, and flag policy exceptions. Manual exercises such as spreadsheet construction, basic email drafting, and navigation through legacy systems are steadily being replaced by automated routines.  Commercial clients can access virtual assistants that deliver personalized insights and complete routine service requests.  Looking to 2030, generative models are expected to guide onboarding interviews, verify forms, and perform rule-based risk assessments.  Banks plan to apply machine learning techniques to small business credit decisions, thereby widening credit access. Pricing of loans, fee structures, and product terms are also expected to adjust dynamically as behavior, financial patterns, and market conditions evolve.  Continuous monitoring will support breach detection and real-time alerting.  Lending to large corporates, however, will continue to rely on human credit committees and board oversight. Legal, tax, risk management, and structuring divisions will remain integral to the process. Investment Banking Digitization Investment banking activity is already experiencing digitization.  Generative systems draft prospectuses and pitch books within minutes by aggregating market data, precedent transactions, and brand-compliant templates.  Internal copilots produce instant digests of earnings calls, analyst reports, and client financial statements. Language model utilities check documentation for missing disclosures and summarize regulatory amendments.  
By 2030, institutions intend to model investor demand and pricing scenarios for equity or debt offerings algorithmically, while leaving final allocations to human syndicate managers. Separate optimization engines are expected to test thousands of capital structure permutations, adjusting debt proportions, equity components, coupon levels, and covenant packages to propose balanced terms for clients.  Syndicate desks will still decide final price points, relying on market knowledge and real-time investor feedback. Relationship building and senior executive advisory assignments will remain person-to-person endeavors. Wealth Management Evolution In wealth management, advisers use copilots that answer factual queries, assemble meeting preparation documents, and summarize full portfolios within seconds. Financial planning engines build personalized plans by modeling life events and risk preferences without starting from a blank template.  Reports to clients now include automatically generated commentary specific to each portfolio.  By 2030, tax optimization routines are scheduled to operate more frequently and with greater precision, while advice and portfolio allocations will be tuned continuously to individual behavior.  Some clients are expected to run their own portfolios by configuring "smart triggers." Even so, human advisers will continue to provide empathy during major market downturns or personal disruptions, and regulators insist that fiduciary responsibility stays with the adviser rather than the software. Rise of Agentic AI in Finance Operations Sidetrade reports that Agentic AI has moved into production over the past twelve to eighteen months and is now being actively applied to order-to-cash activities. In this setting, an agentic system is defined as software that can independently set sub-goals, plan the steps needed to achieve them, and execute those steps without ongoing human oversight. Sidetrade describes the technology as a means of enhancing finance team productivity throughout the entire order-to-cash cycle. Current deployments of Agentic AI can place thousands of personalized outbound collection calls per day, dynamically adjusting language and tone to fit each situation, and escalating only the highest-risk cases to human collectors. The same platform identifies missing remittance information, making traditional match-rate metrics unnecessary, and automatically logs every promise-to-pay date. Natural language models are used to classify incoming emails, detect sentiment, assign dispute codes, and initiate the relevant workflows. Sidetrade measures the overall result as roughly a fifty percent reduction in manual touchpoints while still providing complete coverage of long-tail debtor accounts and improving days-sales-outstanding. The company states that successful implementation depends on four core conditions: high-quality data, systems prepared for integration, disciplined change management, and certified security controls. Sidetrade also points out that traditional rule-based workflow bots often fail when faced with rising exception volumes, while agentic systems can re-plan and maintain operations. As a result, finance team roles are shifting toward oversight of exception cases, with collectors gaining greater confidence as they observe the software consistently adhering to established policy. Canadian Banking AI Survey: Internal vs. 
Customer-Facing Returns Research from GFT Canada surveys more than 200 information technology decision makers in Canadian banking and finds that 99 percent are prioritizing customer-facing artificial intelligence tools, with 68 percent specifically targeting customer service. Nevertheless, only 32 percent report significant return on investment from those tools, whereas 68 percent consider AI a clear value driver for internal processes. Banks presently allocate around 35 percent of total IT expenditure to AI and expect that outlay to rise by 20 percent over five years. Fraud detection and cybersecurity monitoring are the areas where 45 percent of respondents already see benefits. Front-office investment banking divisions report intensive customer service deployments, with 76 percent adopting AI for that purpose and 42 percent experimenting with personalization, yet only 26 percent describe meaningful returns from customer support automation and none from personalization. One-third of respondents have introduced AI to internal operations, and 58 percent of that subset confirm that back-office capabilities generate the strongest value.  Among retail banks, 67 percent invest in customer experience AI, but only 18 percent report measurable returns, while almost two-thirds achieve significant gains from cybersecurity monitoring and administrative automation. The survey concludes that operational improvements, rather than public-facing applications, drive competitive performance. Challenges and Risks of Autonomous AI Systems An analysis by FutureCFO highlights potential challenges with Agentic AI. It notes that autonomous systems executing at high speed can magnify systemic risk in periods of market volatility or during cyberattacks. For activities such as trading or investment advice, future European Union rules may classify such applications as high-risk under the EU AI Act, and counterparty credit reporting may have to move from daily or weekly intervals to real time.  The article also warns that agentic systems handling confidential data can leak information, make mistakes, or behave unethically, leaving institutions liable. It predicts that banks will proceed carefully when scaling because they remain accountable for agent actions. Implementation Barriers and Budget Expectations A GFT press release reiterates that 99 percent of banks remain focused on consumer-oriented AI, but only 32 percent have realized material returns in that domain, whereas 68 percent find the greatest value internally. Fraud detection and cybersecurity automation again emerge as the principal benefit, cited by 45 percent of respondents.  Institutions expect to raise AI budgets by 20 percent over five years. Key barriers include cybersecurity risk at 49.5 percent, data privacy constraints at 37.5 percent, implementation cost at 32.5 percent, shortage of skilled personnel at 29 percent, legacy system complexity at 27 percent, and unclear return metrics at 21 percent. Generative AI in Finance Operations Outsourcing Generative AI is entering finance operations outsourcing. Corporate boards increasingly request demonstrable value, and chief financial officers are under pressure to show tangible results. Organizations are employing generative models in dozens of use cases, either purchasing off-the-shelf products or building their own. Many executive teams remain uncertain about the best platform architecture, the talent required, and the appropriate data governance mechanisms, prompting them to rely on service providers.  
Deloitte reports that its finance and accounting Operate services use generative AI to automate forecasting, invoice processing, and collections to move closer to a touchless financial close. The firm states that implementing generative solutions at scale is more complex and potentially more expensive than many owners assume, and that AI should be viewed as part of a comprehensive technology stack rather than a universal remedy. It identifies additional applications, including smart reconciliation, variance analysis, task management, and dynamic risk assessment.  Deloitte sets out eight guiding principles covering early collaboration, technology roadmap audits, selection between generative AI and robotic process automation, data readiness, ROI-based opportunity mapping for short- and long-run horizons, rigorous risk assessment, proof-of-concept execution, and governance discipline.  Strong compliance frameworks are critical, and providers must supply specialist personnel to validate and monitor AI systems. Deloitte offers an example: its PrecisionView forecasting product achieved 99.6 percent accuracy during the first two-year horizon for unit sales forecasts. Path to Autonomous Banking Bloomberg Intelligence reports that agentic AI is now delivering measurable workflow automation across selected banking functions and could generate productivity gains greater than those forecast for late-2024 generative AI pilots. Full autonomy remains at least five years away, because banks must first resolve data governance gaps, modernize legacy platforms, and secure regulatory clearance. Agentic systems already perform end-to-end tasks such as customer query resolution, account optimization, and straight-through transaction processing, but they run only where institutions have installed new data layers, resilient orchestration frameworks, and integrations with core banking software. A recent Bloomberg survey shows mixed cost expectations: almost 50 percent of banks anticipate lower operating costs within three to five years, whereas over 40 percent expect higher costs, and 15 percent project increases above ten percentage points. Commerzbank provides an early benchmark. The bank plans to spend €140 million on AI programs and forecasts €300 million in benefits, an implied 120 percent ROI that would deliver about 25 percent of its target profit growth to 2028. By contrast, legacy environments still consume roughly 60 percent of a typical bank’s technology budget, limiting near-term deployment capacity. Capital market activity reflects rising interest. Funding for agent platform startups reached $3.8 billion across 162 deals in 2024, nearly triple the 2023 total, and more than 50 percent of those vendors were founded in or after 2023. References to AI agents on listed company earnings calls increased fourfold in the fourth quarter of 2024. Lab-level prototypes now decompose loan approval into discrete agents for data aggregation, credit scoring, risk assessment, decision support, and customer communication, with each step logged for audit. Conference presentations anticipate a shift from single-agent pilots to multi-agent ecosystems overseen by human orchestrators. Enterprise IT groups may absorb an “HR for agents” remit to onboard, monitor, and retire digital workers. Labor market effects appear modest. Respondents predict an average net staff reduction of about 3 percent—about 200,000 positions across the 92-bank sample. 
Sixty percent of firms expect smaller workforces, with the remainder planning redeployment rather than layoffs. Bloomberg Intelligence concludes that agentic platforms will augment rather than replace most roles, redirecting employees toward higher-value activities once routine execution is fully automated.

Five Cross-Cutting Takeaways

Internal automation is the low-hanging fruit. Every survey covered here shows significant, measurable benefits in fraud detection, cybersecurity, finance operations, and core banking process automation—well ahead of glossy client chatbots. Funding, talent, and pilot projects should follow the areas with the clearest financial impact.

Agentic systems shift the conversation from “AI tools” to “digital workforce”. Once software agents can re-plan tasks on the fly, banks need lifecycle management: onboarding, performance metrics, escalation rules, and offboarding. Information technology, human resources, and risk management now form a new control triad.

Data modernization is now the gating factor. AI return on investment is capped by legacy core systems that still consume more than sixty percent of budgets. Banks that have already built unified data layers—such as Commerzbank and select U.S. regional banks—are the ones projecting triple-digit ROI. Others will continue burning cash until they catch up.

Human expertise remains the circuit breaker. In commercial and investment banking, credit committees, syndicate desks, and senior coverage remain indispensable. In wealth management, empathy during market sell-offs and fiduciary accountability are essential. For risk management, humans are required to interpret outlier events that AI has not encountered. Autonomy will arrive gradually, not all at once.

Regulatory clarity will shape adoption speed more than technology itself. Real-time counterparty reporting, model registries, and agent “kill switches” will become regulatory standards. Firms that embed compliance into their system architectures early will be able to scale more quickly.
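To make the lab-level prototypes mentioned earlier more concrete (loan approval decomposed into discrete agents for data aggregation, credit scoring, risk assessment, decision support, and customer communication, with each step logged for audit), here is a minimal toy sketch in Python. The agent names, the scoring rule, and the log format are assumptions for illustration only; a real deployment would wrap models, bureau lookups, and core banking integrations behind the same interfaces.

```python
from dataclasses import dataclass, field

@dataclass
class LoanCase:
    applicant_id: str
    income: float
    debts: float
    data: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

def log(case, agent, message):
    # Every agent step is recorded for audit, as described in the article.
    case.audit_log.append(f"[{agent}] {message}")

def data_aggregation(case):
    case.data["dti"] = case.debts / max(case.income, 1.0)  # debt-to-income ratio
    log(case, "data_aggregation", f"computed DTI={case.data['dti']:.2f}")

def credit_scoring(case):
    case.data["score"] = max(0, 800 - int(case.data["dti"] * 400))  # illustrative rule
    log(case, "credit_scoring", f"score={case.data['score']}")

def risk_assessment(case):
    case.data["risk"] = "high" if case.data["score"] < 600 else "normal"
    log(case, "risk_assessment", f"risk={case.data['risk']}")

def decision_support(case):
    case.data["recommendation"] = (
        "approve" if case.data["risk"] == "normal" else "refer to human credit committee"
    )
    log(case, "decision_support", case.data["recommendation"])

def customer_communication(case):
    log(case, "customer_communication", f"drafted outcome letter for {case.applicant_id}")

if __name__ == "__main__":
    case = LoanCase("SME-001", income=120_000, debts=30_000)
    for agent in (data_aggregation, credit_scoring, risk_assessment,
                  decision_support, customer_communication):
        agent(case)
    print("\n".join(case.audit_log))
```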
Dzmitry Garbar • 8 min read
Lovable grows to $1.8 B valuation despite security issues
Its sole product, released in late 2024, is a web platform that converts plain language prompts into production-ready software. Lovable markets the service as a "vibe coding" environment and an "AI Software Engineer" that allows anyone, whether a coder or not, to assemble a complete application or website by describing the desired outcome in natural language. The slogan, "the last piece of software," reflects management’s belief that the tool will eliminate most routine programming work. How Lovable Works A user writes a short prompt, such as a request for a customer feedback portal or a multiplayer quiz game. Lovable then orchestrates several large language models to create the front-end interface, the back-end logic, a database, and the deployment pipeline.  The system stitches together code from Anthropic’s Claude, OpenAI’s GPT series, and Google’s Gemini family, but it relies primarily on Claude Sonnet after internal tests showed that this model is the most reliable. The firm built its own benchmark to measure how often a model "hits a wall" during the build process. Claude scored highest, so it serves as the default engine, while the others act as fallbacks.  The generated project includes two-way GitHub synchronization, Supabase back-end integration, a visual editor, and inline educational snippets that explain what each file does. The stated goal is a twenty-fold productivity gain for professional developers and a zero-to-one path for non-technical founders. Main Use Cases First, founders and product managers use the platform for rapid ideation and proof-of-concept work. Second, small businesses build commercial-grade web products, avoiding the cost of hiring a full development team. Third, larger organizations create internal tools — dashboards, workflow trackers, lightweight CRMs — without adding to backlogs in their main engineering groups. Unusually Strong Early Revenue Within three months of launch, Lovable reported annualized recurring revenue of $17 million and thirty thousand paying customers. Six months in, a public case study by Anthropic quoted $40 million in annual recurring revenue, and by the seventh month, the number had risen to $75 million.  Management also reports that more than one million people use the platform each month. Lovable is "Europe’s fastest-growing company ever" and the data lends some weight to this claim. The market intelligence outlet Sifted ranked Lovable first on its 2025 B2B SaaS Rising 100 list, an annual survey limited to companies valued below one billion dollars. Ambitious Financing Plan Lovable is in the final stages of a round that would raise more than $150 million and imply a valuation of approximately $1.8 billion. Accel is leading the transaction, with prior backers 20VC and Creandum taking pro rata positions.  The firm last tapped the market six months ago, in February 2025, when Creandum led a $15 million pre-Series A round. People involved in the new deal say the target size and headline valuation were increased during negotiations because investor demand was stronger than expected.  Both Lovable and Accel have declined public comment, but the consensus among advisers is that the round will close as scheduled on the stated terms. If completed, the deal would place Lovable among a growing group of multibillion-dollar European AI companies that includes Mistral, Synthesia, DeepL, and Helsing.  
Security Concerns The company claims that, on average, the end-to-end build process runs twenty times faster than a traditional development cycle. The speed advantage, however, brings security concerns. A Replit employee published research showing that 170 of 1,645 analyzed Lovable projects exposed personal data, including names, email addresses, financial information, and secret API keys. The root cause was almost always the same: misconfigured Supabase access controls. Because Lovable encourages direct database connections, an error in the rules can leave tables open to the public internet. The vulnerability was entered into the U.S. National Vulnerability Database after Lovable’s 45-day remediation window lapsed. On the social media site X, the company conceded that it is "not yet where we want to be" on security and promised to improve. It also released a built-in scanner that checks whether Supabase access controls are enabled, though critics note that the tool does not verify whether the rules themselves are correct. Security specialists argue that letting novice users attach live databases to public applications revives risks long considered solved in professional engineering. In their view, insecure vibe-coded apps are the single biggest challenge. Former Facebook security chief Alex Stamos says the odds of a beginner configuring permissions correctly are "extremely low." Replit chief executive Amjad Masad adds that any platform making deployment trivial must also make accidental exposure difficult. Analysts draw parallels to the 1990s web era, when dynamic sites multiplied faster than secure coding practices could keep up. The difference now is that attackers can automate reconnaissance with the same AI tools developers use to write code. Lovable’s leadership says community feedback is integral to its roadmap. An active Discord server acts as a bug report channel, feature request board, and informal support forum. Adoption appears strongest among solo entrepreneurs, small founding teams, and product or design professionals looking to build tangible demos without waiting for scarce engineering resources. The company’s documentation emphasizes that human review remains essential for sensitive deployments, and forthcoming releases will add automated penetration tests, linting for security issues, and one-click migration from the built-in database to managed cloud services with stricter defaults. Broader Trend European AI agent start-ups collectively raised €481 million during the first six weeks of 2025, already more than a quarter of the total for all of 2024. Competition is emerging on both sides of the Atlantic. In the United States, direct competitor Anysphere tripled its valuation to $9 billion after a $900 million Series B in May 2025. The broader coding assistant field now includes Microsoft, OpenAI, Anthropic, Poolside, Bolt, and Replit, while design software providers Figma and Canva are adding generative AI modules of their own. Investors often cite "vibe coding" as one of the most straightforward ways to monetize large language model technology because it maps directly onto established software team budgets. Lovable distinguishes itself by aiming squarely at users with little or no coding background. Where Cursor and other incumbents target professional engineers, Lovable promises true no-code operation.
The system reads a text brief, proposes an initial architectural outline, asks follow-up questions if necessary, and then emits a running application with authentication, database tables, and deployment scripts already configured.  2025 as a Pivotal Year Industry observers view 2025 as a pivotal year for AI-driven software creation. Only about one percent of the global population can write code, yet software needs continue to expand. By pushing large language model capabilities into a no-code package, Lovable aims to widen the pool of "builders" and cut iteration time for professionals.  Chief executive Anton Osika proposes that engineers will shift from handcrafting features to integrating pre-built blocks and translating user needs into system specifications. He sees further model advances reducing the manual work still required today, though he also argues that human oversight will remain necessary to assure quality, reliability, and policy compliance. Looking Ahead Lovable plans to channel the new capital into model research, security hardening, and a push into enterprise accounts. Management believes that combining faster build times with reliable guardrails will appeal to corporate teams that must deliver internal tools under tight budgets. Part of the funding will go toward expanding the platform’s library of pre-configured components and industry templates — health records viewers, logistics dashboards, subscription e-commerce stacks — so users can start from semi-finished blueprints rather than blank pages. The firm also intends to deepen its relationship with Anthropic, arguing that specialized fine-tuning could raise reliability further and open the door to domain-specific variants for finance, healthcare, and education. Investors frame Lovable as a bet on the application layer of generative AI rather than on foundational models themselves. While Europe has fewer hyperscale compute clusters than the United States or China, its start-up scene has proved capable of building vertical products on top of global model APIs. Lovable’s challenge is to maintain its early lead in usability and speed while closing the gap in security maturity. If it succeeds, the company may demonstrate that world-class AI software firms can scale rapidly outside Silicon Valley. If it falters, competitors with deeper pockets or stronger security postures will absorb the same demand. Either way, the rapid rise of vibe coding tools suggests that natural language programming is moving from experiment to mainstream practice, and Lovable has placed itself at the center of that transition. The next twelve months will test every dimension of the business. Users will tolerate friction in a young product, but they expect steady evidence that the platform can support production workloads without exposing sensitive data. If Lovable can balance velocity with safety, it will not only shorten build cycles, it could change who gets to participate in software creation and how the industry values the skill of writing code.
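The Supabase misconfiguration problem described above can be probed with a very small script: if an anonymous (publishable) key can read rows from a table that should be private, row-level security is effectively not protecting that table. The sketch below is a hedged illustration using the supabase-py client; the project URL, key, and table names are placeholders, and deciding which tables "should be private" is an assumption the operator has to supply.

```python
from supabase import create_client  # pip install supabase

# Illustrative probe: can the public (anon) key read tables that should be private?
# Placeholders only - substitute your own project values before running.
SUPABASE_URL = "https://your-project.supabase.co"
SUPABASE_ANON_KEY = "public-anon-key"
SHOULD_BE_PRIVATE = ["profiles", "payments", "api_keys"]  # assumption for illustration

client = create_client(SUPABASE_URL, SUPABASE_ANON_KEY)

for table in SHOULD_BE_PRIVATE:
    try:
        rows = client.table(table).select("*").limit(1).execute().data
        status = "EXPOSED" if rows else "no rows returned (RLS enforced or table empty)"
    except Exception as exc:  # permission errors typically surface here when access is blocked
        status = f"blocked ({type(exc).__name__})"
    print(f"{table}: {status}")
```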
Dmitry Baraishuk • 5 min read
Cloudflare Will Charge OpenAI Bots for Scraping Websites
The mechanism revives HTTP status code 402, "Payment Required," an unused element of the original web specification, and uses it to signal that a charge is due before content is served. Cloudflare functions as the merchant of record, so the publisher does not need to integrate a payment gateway, issue invoices, or reconcile receipts. Cloudflare collects funds from the crawler operator and remits them to the publisher on its normal payout schedule. Deployment In the Cloudflare dashboard, the publisher sets a single price that applies to every request for the entire domain. Next, the publisher assigns one of three actions to each known crawler: "Allow" for full, free access, "Charge" to deliver content only if the correct payment intent is present, and "Block" to deny all requests. If a crawler marked "Charge" does not yet have a Cloudflare billing relationship, the request still receives a 402 response, but no content is returned, and the header informs the caller that payment would grant access if the relationship is established later. All routing decisions run after existing WAF, rate limiting, and bot management policies, so the feature does not interfere with the site’s current security posture. For a crawler operator, participation begins with identity proof. The operator generates an Ed25519 key pair, publishes the public key in JSON Web Key (JWK) format at a known URL, and registers that URL along with the crawler’s user-agent string with Cloudflare. Every request is then signed under the emerging Web Bot Auth standard and carries three headers — Signature-Agent, Signature-Input, and Signature — so the edge can confirm that the message came from the declared crawler and has not been spoofed. Unsigned or malformed requests never proceed to the payment check; they are processed or blocked by the publisher’s existing bot rules as usual. Once a crawler is recognized, payment negotiation follows one of two flows. In the reactive flow, the crawler makes a normal request, the edge returns a 402 status that includes a crawler-price header with the exact charge in US dollars, and the crawler repeats the request with a crawler-exact-price header containing that figure. If the header matches the configured fee and the signature is valid, Cloudflare serves the content with a 200 OK response and logs a billable event. In the proactive flow, the crawler states a maximum acceptable price in a crawler-max-price header on its first attempt. If the site’s configured fee is at or below that ceiling, the content is served immediately, the actual charge is echoed in a crawler-charged header, and the event is logged. If the fee is higher than the crawler’s ceiling, the edge returns 402 with the posted price. Only one price declaration header — either exact or maximum — may appear in a single request; if both are present or if the header is absent on a "Charge" path, the edge responds with 402. Accounting is Automatic Each successful paid response is recorded with the authenticated crawler identity and the amount charged. Cloudflare aggregates these entries, debits the crawler operator’s chosen payment method, and credits the publisher. Because Cloudflare is merchant of record, the publisher sees a single consolidated remittance and does not handle disputes or chargebacks. The workflow is identical whether the site processes a few dozen paid crawls per month or several million. The beta enforces one flat price for the entire site. 
Cloudflare’s roadmap includes path-level pricing, dynamic fees based on demand or crawler category, and license distinctions for training, inference, or search, but none of these features are live. Exemptions can be added at any time, so a publisher can grant free access to a research crawler while charging commercial models. The feature can be disabled by removing the rule; doing so reverts the site to its previous open or blocked posture without code changes. Pay per Crawl therefore creates a predictable commercial framework for automated content access. It adds no local infrastructure, relies on standard HTTP, uses cryptographic signatures for identity, and integrates billing into Cloudflare’s existing edge platform — giving executives a clear path to monetize crawler traffic without negotiating individual contracts or staffing additional operations. About one fifth of public websites already sit behind Cloudflare, and the company now offers to authenticate web crawlers, negotiate a price through HTTP headers, collect fees, and remit them to site owners. Large publishers such as Condé Nast, TIME, the Associated Press, and others have agreed to block unregistered AI crawlers by default and to rely on Cloudflare for paid access. Crawlers must identify themselves with RFC 9421 cryptographic message signatures; user-agent strings alone are no longer enough. The program exempts Google’s traditional search crawler, reflecting publishers’ continued dependence on Google for traffic. As a result, Google can still train models on its cached pages without direct payment, giving it a competitive cost advantage and reinforcing its market power. Supporters and Critics Supporters argue that charging per crawl will fund infrastructure costs, reduce bot traffic, and prompt the largest AI and search companies to cooperate on shared crawling services instead of each fetching the same pages repeatedly. Critics respond that fees may encourage mass production of AI-generated "slop" designed only to earn crawl revenue, raise barriers for smaller AI startups, and strengthen Cloudflare’s position as a private gatekeeper. Publishers differ in their incentives. Governments, large corporations, and tourism boards often benefit when AI models quote their content, so they may prefer unrestricted crawling. Lawmakers are starting to look at how copyright and antitrust rules should cover AI training. U.S. courts have often said that using content to train AI counts as "fair use", which weakens publishers’ bargaining power, but the rules are still unclear. Technical fixes help for now, but they won’t solve everything, and full answers will have to wait for new laws or regulations. OpenAI Response to Cloudflare AI Crawler Block OpenAI declined to participate in Cloudflare’s preview program to block AI crawlers by default, saying the measure would introduce an unnecessary intermediary. The Microsoft-backed lab emphasized that it pioneered the use of robots.txt to limit automated data scraping and stated that its own crawlers already honor publishers’ preferences.
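To make the reactive flow described above concrete, here is a hedged Python sketch of a crawler handling the 402 exchange: request the page, read the posted price from the crawler-price header, and retry with crawler-exact-price if the price is acceptable. The header names and status code follow the description in this article; the Web Bot Auth signing (Signature-Agent, Signature-Input, Signature) is represented by a placeholder function, since key handling is out of scope here, and the price ceiling is an assumption.

```python
import requests

MAX_ACCEPTABLE_PRICE = 0.05  # USD per crawl this operator is willing to pay (assumption)

def signed_headers(extra=None):
    """Placeholder for Web Bot Auth signing; a real crawler would sign each
    request with its registered Ed25519 key and attach the three signature headers."""
    headers = {"User-Agent": "ExampleCrawler/1.0"}
    headers.update(extra or {})
    return headers

def fetch_with_pay_per_crawl(url):
    # First attempt: an ordinary signed request.
    resp = requests.get(url, headers=signed_headers())
    if resp.status_code != 402:
        return resp  # free content, a block, or an unrelated error

    price = resp.headers.get("crawler-price")  # posted price in USD, per the reactive flow
    if price is None or float(price) > MAX_ACCEPTABLE_PRICE:
        return resp  # too expensive or malformed: keep the 402 response

    # Retry, committing to the exact posted price; a 200 response is a billable event.
    return requests.get(url, headers=signed_headers({"crawler-exact-price": price}))

if __name__ == "__main__":
    r = fetch_with_pay_per_crawl("https://example.com/article")
    print(r.status_code)
```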
Dmitry Baraishuk • 4 min read
Microsoft AI for Health: MAI-DxO Is 4 Times Better at Diagnosis Than Doctors
The project has two main pieces. Sequential Diagnosis Benchmark The first is a test set called the Sequential Diagnosis Benchmark, or SDBench. It turns 304 detailed New England Journal of Medicine case reports into step-by-step puzzle scripts. In each script, the decision maker — whether a human doctor or an AI model — receives only a short opening note. The decision maker must then ask questions or order tests one at a time, just as doctors do during a real consultation. A separate "gatekeeper" program releases an answer only if the request is specific. If the original article does not contain a requested lab value, the gatekeeper invents a realistic number so that no one can guess the right diagnosis from missing data. At the end, the test checks whether the final answer matches the article and it adds up the dollar cost of every question and every test. MAI Diagnostic Orchestrator The second piece is a control tool called the MAI Diagnostic Orchestrator, or MAI-DxO. It does not hold medical knowledge of its own. Instead, it tells a modern language model — such as GPT-4o, Gemini, Claude, Llama, or Grok — how to behave like a careful medical team.  Virtual Medical Team Structure The orchestrator splits the job into several virtual "doctors".  One keeps a ranked list of possible diseases.  One chooses the next question or test that should remove the most doubt.  One acts as a skeptic and keeps looking for other explanations.  One watches the running bill.  One checks that every step follows basic safety rules.  The orchestrator repeats this loop until extra tests would not change the top answer. It then delivers a single diagnosis and a summary of what it spent to reach the conclusion. Because the system is model-agnostic, Microsoft can plug in different language models without changing the control logic. Performance Results: AI vs. Human Doctors Microsoft paired the orchestrator with OpenAI’s o3 model for its headline demonstration. Under those conditions, the AI reached the correct diagnosis in roughly 85 percent of the SDBench cases described in the press material and about 80 percent in the formal paper.  A comparison group of 21 experienced physicians, drawn from the United States and the United Kingdom and averaging 12 years in practice, solved about 20 percent of the same problems.  Those doctors were not allowed to use online references or language models, a rule meant to keep the playing field level with the AI, which also had to rely only on information revealed by the gatekeeper. Cost Analysis: Efficiency Through Orchestration Cost figures point in the same direction. The doctors spent just under $3,000 per case if one counts $300 for every consultation round and standard 2023 U.S. prices for each test. The orchestrated o3 model spent about $2,400. When the researchers ran o3 on its own, without the orchestrator controlling the process, the model asked for many more tests. Its accuracy stayed high, but the bill climbed to nearly $8,000 per case.  Microsoft argues that this spread shows the value of a formal process that forces the model to think in small steps and keep an eye on cost. Study Limitations and Boundaries All SDBench cases are difficult teaching cases, not routine coughs, rashes, or hypertension visits. The benchmark ignores regional price swings, insurance discounts, test wait times, and the discomfort a patient feels during a procedure. The doctors in the trial worked as unassisted generalists, though in real life they would call in specialists for rare conditions. 
For these reasons, Microsoft labels the work an early proof of concept, not a finished clinical product. The company states that the orchestrator has not yet been used on live patient data outside controlled tests. MAI-DxO on Github As of 7 July 2025, there is still no public GitHub repository for MAI-DxO. Exhaustive searches and direct requests to the obvious URLs (for example, github.com/microsoft/MAI-DxO or github.com/mai-dxo) return a 404, confirming that no repository under that name is visible to the public today. Microsoft’s own launch blog, published in late June, makes it explicit that both the Sequential Diagnosis Benchmark (SDBench) and the MAI Diagnostic Orchestrator are "research demonstrations" and not yet available as public benchmarks or orchestration code. The accompanying arXiv preprint likewise gives no code link—only a promise that SDBench may be released "in the coming weeks", while staying silent about any timeline for open-sourcing the orchestrator itself. In short, the orchestrator almost certainly exists inside Microsoft’s internal repositories, but nothing is cloneable today. If it is eventually open-sourced, it will most likely appear under the microsoft or Azure-Samples organisations (https://github.com/Azure-Samples/healthcare-agent-orchestrator). DxGPT: Real-World Deployment for Rare Diseases Alongside MAI-DxO, Microsoft is promoting another tool, DxGPT, that focuses on rare diseases. The company says DxGPT reaches about 60 percent diagnostic accuracy across all diseases and close to 50 percent for rare disorders, numbers that put it in the same range as a trained clinician.  Current Implementation in Healthcare Systems DxGPT is already running in the Madrid regional health service, where 6,000 doctors may consult it, and the company estimates around 500,000 patients have benefited from its suggestions. DxGPT is available through the Azure Marketplace, which lets hospitals that already use Microsoft’s cloud add the service with limited extra effort. Microsoft's Healthcare Division Both tools are inside a health-specific unit Microsoft created at the end of 2024. Mustafa Suleyman, who co-founded DeepMind and later led the startup Inflection, joined Microsoft that year and now oversees all consumer AI products as well as the health group. Dominic King, a physician and former Google Health executive, serves as vice president.  Their mandate is to merge clinical insight, product design, and model research into tools that improve diagnostic accuracy and lower cost. They repeat in public statements that doctors will remain responsible for treatment plans, patient communication, and ethical accountability. The AI’s role, they say, is to support judgment. OpenAI Partnership A major factor enabling the work is Microsoft’s partnership with OpenAI. From an initial $1 billion stake in 2019, Microsoft’s total commitment has risen to about $14 billion. The contract runs until 2030 but allows OpenAI to leave early if its board declares that it has achieved artificial general intelligence. News reports describe friction over OpenAI’s wish for more commercial freedom, but Suleyman says the alliance remains strong and long-term. Market Opportunity and Growth Projections Analysts expect global spending on AI healthcare applications to rise from roughly $8 billion in 2023 to about $200 billion by 2030, driven mainly by tools that improve diagnosis and trim unnecessary testing. 
Microsoft views MAI-DxO, DxGPT, and the broader Azure platform as a way to capture a large share of that growth. Company disclosures say revenue from healthcare-related cloud services has more than tripled since 2020. Equity analysts maintain a strong buy consensus on Microsoft stock and see further upside, even after years of outperformance. Clinical Need: Addressing Misdiagnosis and Cost The clinical need is equally large. A 2023 study by the U.S. Agency for Healthcare Research and Quality estimated that American emergency departments misdiagnose about 7.4 million patients each year, with one in 350 cases ending in death or serious disability. At the same time, redundant tests add billions of dollars to national health costs. Microsoft argues that a system able to reach an accurate answer with fewer procedures can reduce both patient harm and wasteful spending. Path to Real-World Implementation Turning research into real-world impact will take time. Microsoft is negotiating with hospital systems to run live trials that feed the orchestrator real electronic health record data under regulatory oversight. Early uses are likely to appear as clinician-facing second-opinion tools. Consumer symptom checkers would follow only after regulators and professional bodies are satisfied that the system is reliable. Because Bing and Copilot already handle about 50 million health-related queries per day, Microsoft could integrate diagnostic suggestions quickly once confidence is high enough. Key Takeaways for C-Suite Executives For a C-suite audience, three messages matter.  First, structured large language model systems can already outperform experienced generalists on complex cases in controlled settings.  Second, an orchestration layer that forces the model to ask targeted questions and watch its own budget prevents runaway costs.  Third, commercial uptake will depend on showing the same gains on typical cases, proving fairness across diverse patient groups, and fitting within existing clinical workflows and liability rules. Microsoft’s size, cloud footprint, and access to frontier models give it a head start, but hospitals, insurers, and regulators will decide how quickly the technology becomes routine care. Future Vision If MAI-DxO and related tools clear those hurdles, they could change how diagnostic work is distributed between humans and software. Doctors would focus on final judgment, complex communication, and the hands-on parts of care, while orchestrated AI systems would handle the exhaustive review of differential diagnoses and the cost-benefit arithmetic behind each test order. That shift would not remove physicians from the loop, but it could let them spend less time on information gathering and more on treatment planning and the human side of medicine.  Whether that potential becomes daily practice will depend on rigorous field trials, transparent error tracking, and clear accountability frameworks. Microsoft’s next steps — live deployments, peer-reviewed validation, and regulatory engagement — will show whether the early performance numbers can survive the complexity of real healthcare environments.
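Because no MAI-DxO code is public, the following Python toy is purely an illustration of the role-based loop described earlier (hypothesis keeper, test chooser, skeptic, cost watcher, gatekeeper), not Microsoft's implementation. The diseases, tests, prices, budget, and stopping rule are all invented for the example; in a real orchestrator each role would be a language model call rather than a stub.

```python
# Toy sketch of a role-based diagnostic loop (illustrative only, not MAI-DxO).
DISEASES = {
    "condition_a": {"fever", "rash", "high_crp"},
    "condition_b": {"fever", "cough"},
    "condition_c": {"rash", "joint_pain"},
}
TESTS = {"temperature": ("fever", 50), "crp_panel": ("high_crp", 120),
         "skin_exam": ("rash", 80), "chest_xray": ("cough", 200)}
TRUE_FINDINGS = {"fever", "rash", "high_crp"}   # the "patient", hidden from the loop

def ranked(findings):
    """Hypothesis keeper: rank diseases by overlap with confirmed findings."""
    return sorted(DISEASES, key=lambda d: len(DISEASES[d] & findings), reverse=True)

def next_test(findings, done):
    """Test chooser: pick the cheapest remaining test that probes an unconfirmed
    finding of the current leading hypothesis (a 'skeptic' role would widen this search)."""
    leader = ranked(findings)[0]
    options = [(cost, name) for name, (f, cost) in TESTS.items()
               if name not in done and f in DISEASES[leader] - findings]
    return min(options)[1] if options else None

findings, done, spent, budget = set(), set(), 0, 400   # cost-watcher budget (assumption)
while (test := next_test(findings, done)) and spent + TESTS[test][1] <= budget:
    finding, cost = TESTS[test]
    done.add(test)
    spent += cost
    if finding in TRUE_FINDINGS:   # the "gatekeeper" reveals only the result of the ordered test
        findings.add(finding)
print("diagnosis:", ranked(findings)[0], "| spent:", spent)
```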
Dzmitry Garbar • 6 min read
Senior Software Developer in the UK is a Top 5 In-Demand Role
In May 2025, a UK job-matching platform reported that the UK labor market recorded 858,465 advertised vacancies, 0.49 percent more than in May 2024. This result marked a third straight month in which the year-on-year count of vacancies rose. Average advertised salary increased for the twelfth consecutive month, reaching £42,403 — 0.3 percent higher than in April and 9.38 percent higher than a year earlier. The rise in the statutory National Living Wage by 6.7 percent in April contributed to this pay growth. Entry-level demand moved in the opposite direction. Graduate postings declined by 4.2 percent during May and finished 28.4 percent below their level of May 2024, the lowest reading since July 2020. The broader set of entry-level jobs — which includes apprenticeships, internships, and other junior roles — was 32 percent smaller than in November 2022. Such roles now represent 25 percent of all adverts; two years earlier they made up almost 29 percent. Logistics and warehouse listings rose 9.77 percent in May, creative and design increased 6 percent, hospitality and catering 5.61 percent, and teaching 1.63 percent. Healthcare and nursing vacancies fell 10.21 percent, administration 9.22 percent, maintenance 7.95 percent, and domestic help and cleaning 5.72 percent. The average time to fill a vacancy shortened from 39.5 days earlier in 2025 to 35.8 days in May, yet the ratio of jobseekers to vacancies rose from 1.98 to 2.02. The northeast recorded 3.32 jobseekers per vacancy, the southwest 1.32. Jobseeker-per-vacancy ratios increased in ten of the UK’s twelve regions. Northern Ireland recorded the fastest annual advertised salary growth at 12.63 percent, lifting the regional average to £40,726. London remained the highest-paying region at £48,680, followed by eastern England at £41,013. The list of most in-demand occupations placed healthcare support workers first for a sixth month, ahead of social care workers and sales assistants; software developers re-entered the top five. The overall outlook is one of cautious optimism, but automation may alter hiring patterns for early-career roles. Many link the fall in entry-level hiring to the rapid spread of generative AI tools after November 2022. They say chatbots now handle basic customer support, AI programs draft routine legal documents and review discovery files, and models perform simple data analysis once assigned to junior staff. Automated systems also write standard account management emails. Because these tasks no longer need trainees, firms hire fewer graduates. AI raises the output of experienced workers more than that of newcomers, which could widen pay and career gaps. Others cite different reasons. They note higher borrowing costs after the Bank of England lifted its base rate to 5.25 percent, the end of the post-pandemic hiring surge, and recent increases in National Insurance and other taxes. Several point out that large technology companies are adding junior staff offshore, so some entry-level work may be shifting overseas, not vanishing. Apprenticeships in skilled trades are down about 30 percent, and vacancies in healthcare, maintenance, and cleaning have also fallen — areas that depend little on language model technology. This suggests that broader economic forces are also at play. Wages are still rising faster than consumer inflation. Specialists, however, warn that the advantage could fade if price growth slows while pay growth continues.
Some analysts believe the higher National Insurance charge is already limiting pay deals, but others see little effect so far. Finally, views diverge on whether AI is compressing pay bands: some expect flatter scales and lower inequality, others expect little change.

Experiences with AI in software work differ widely. A few senior engineers say language model tools let them turn out routine code, quick prototypes, and data-processing scripts two to four times faster. Others count only a 10–20 percent boost once the extra bug fixing is included. Most commenters note that the tools deliver value only when an experienced person guides them step by step. Very large codebases still lie beyond current model limits. When development does speed up, project managers often add more features to the schedule, so total workloads stay high. Users also list several drawbacks: the models can give confident but wrong answers, skip edge cases unless prompted with care, and insert subtle bugs that need thorough checks.

Because of these issues, many caution that a simple link between wider AI use and the fall in entry-level hiring is unproven. Economists say proper causal studies — ones that separate the effects of technology, interest rates, taxes, and other factors — are needed before drawing conclusions.

Observed employer behavior aligns with several of these themes. Companies are running informal hiring freezes to preserve cash, posting exploratory vacancies to gauge conditions, and relying on smaller teams equipped with advanced tools to cover broader scopes. Recruiters report mismatches between job descriptions and candidate skills, including requirements for several years of experience with recent technologies. Applicants continue to receive outreach for roles unrelated to their stated competencies.
Alexander Kom • 3 min read
Gemini CLI Free Open-Source AI Agent
Availability & onboarding
The public preview went live on June 25, 2025. Installation is a single command from GitHub, reportedly takes less than a minute, and needs only an email address. The same repository serves as the hub for bug reports and feature requests.

Core capabilities & use cases
Once installed, Gemini CLI can draft, modify, or migrate code, explain tricky snippets, generate unit tests, and "vibe code". It resolves bugs through troubleshooting, executes shell commands described in plain English, manipulates files, and can be integrated into scripts to automate pipelines. Prompts may be grounded with live Google Search results for up-to-date context, and the agent can invoke Imagen for images or Veo for video, so media generation happens without leaving the terminal. For long-form analysis, it taps Google’s Deep Research agent, and through Model Context Protocol (MCP) servers it can talk to external databases or services.

Model & technical details
Under the hood is Gemini 2.5 Pro, Google’s most advanced model for coding and reasoning, with a sprawling one-million-token context window. Although the agent runs locally, inference is cloud-hosted; Google says it is not providing on-device model support "today". In the free preview tier, developers may issue up to sixty requests per minute and one thousand per day — limits Google calls the largest in the industry and roughly double its own engineers’ average usage.

Platform integration
Gemini CLI runs on Windows, macOS, and Linux terminals and shares its core technology with Gemini Code Assist, Google’s IDE extension. Inside VS Code, a new agent mode mirrors CLI behavior, crafting multi-step plans, recovering from failed attempts, and creating brand-new solutions. Signing in with any personal Google account suffices for the free tier — no credit card or API key required.

Usage limits, pricing & licensing
Individuals — students, hobbyists, or freelancers — merely sign in with a personal Google account to receive a free Gemini Code Assist license with Gemini 2.5 Pro, the million-token context window, and the 60-per-minute / 1,000-per-day allowance at no charge. Professional developers who need multiple concurrent agents or bespoke models add a Google AI Studio or Vertex AI key and pay only for what they use. Organizations that require policy controls, governance, or large-scale parallelism step up to Standard or Enterprise Code Assist plans for a fee. Google underscores that "for the vast majority of developers, Gemini CLI will be completely free" and notes that users "rarely, if ever, hit a limit," while also conceding it has not promised the tool will remain free after general availability.

Open-source stance & extensibility
Released under Apache 2.0, Gemini CLI invites full code inspection, security auditing, and community contribution. Google explicitly "expects and welcomes" global developers to file issues, suggest features, and submit pull requests on GitHub. Extensibility rests on built-in MCP server support to reach any compliant service, bundled extensions that package MCP servers with configuration, and project-level GEMINI.md files for tailored system prompts. Settings can be tuned per user or shared across a team, recognizing that the terminal is a deeply personal space.

Security & execution safeguards
Although model calls leave the machine, the agent runs locally and asks for confirmation before executing every command — choices are "allow once," "always allow," or "deny."
For extra defense, users may leverage macOS Seatbelt sandboxing, run Gemini CLI inside Docker or Podman containers, and channel network traffic through a proxy. The agent only sees data explicitly supplied in a prompt or referenced path, and open-source transparency allows organizations to audit every line of code.

Competitive positioning
Google highlights Gemini CLI’s free, open-source nature as a sharp contrast to paid, proprietary rivals such as OpenAI Codex CLI or Anthropic Claude Code. The generous request limits aim to undercut tools like GitHub Copilot or Microsoft’s Windows Terminal AI assistant, and by hosting the agent itself, Google hopes to forge a direct bond with developers rather than letting third-party tools mediate Gemini access. Because there is no meter running for most users, Google predicts broader everyday adoption: many developers simply avoid paid tools for casual tasks.

Warnings, caveats & external observations
Skepticism remains. A 2024 Stack Overflow survey found only 43 percent of developers fully trust AI tools, and several studies show code-generating models can introduce bugs or miss security flaws. Google reiterates that Gemini CLI does not yet support fully offline use, and it has not spelled out what happens if a user exceeds the free quota or whether the preview-era generosity will persist after general release.

Early testing
Early testers confirm it can scan a full repository, refactor many files at once, and even build a starter application in a single step. They also note that language conversions, such as “Ruby to JavaScript,” complete quickly. Many teams, however, see slow responses and frequent downgrades to a smaller model called Flash. Rate-limit errors (HTTP 429) are common. Authentication is fragile on headless servers, some Workspace domains, and remote shells. Windows users who rely on screen readers cannot navigate the menus.

Several high-value features are still missing. Gemini CLI cannot spin up smaller, task-specific agents the way Anthropic’s Claude Code can. Security teams cannot yet set fine-grained rules to block destructive commands or restrict write locations. Architects want a simple YAML file to mark module boundaries, so the agent reads only the code it needs. Others ask for an “offline mode,” Neovim support, and a single-file installer instead of a Node/NPM stack.

Account and billing paths add complexity. Personal Google accounts get a free tier. Workspace logins require a Cloud project and may trigger Vertex AI charges. API keys follow AI Studio quotas. Paid subscribers to Gemini Pro, Google One, or the new AI Ultra plan do not know what carries over. Enterprise licenses (“Standard” and “Enterprise”) exist, but how they relate to the CLI is unclear, and some student or legacy domains face extra delays.

Privacy messaging is inconsistent. One document says prompts and code are not stored; another states that personal-tier traffic is reviewed for model training unless users find and disable a hidden option. Google has clarified that only the free personal tier feeds training data; paid tiers do not. Optional telemetry can be turned off, but past opt-out incidents keep some teams wary.

Reliability concerns mirror other AI coding tools: incomplete edits, duplicated blocks, broken imports, and Unicode errors. Animated status lines can hide inactivity. When projects grow large, the agent sometimes stalls or proposes broad refactors instead of asking for guidance.
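Setting those caveats aside, the scripting and pipeline integration mentioned above is easy to picture with a short sketch. The Python snippet below is a minimal illustration, not an official recipe: it assumes the gemini executable is already installed and on PATH and that it accepts a prompt on standard input in a non-interactive run (invocation details vary between preview releases), and any CI use should stay mindful of the free tier's 60-requests-per-minute and 1,000-per-day limits.

# Minimal sketch: asking a locally installed Gemini CLI to review a diff in a CI step.
# Assumptions (not verified against any specific release): the `gemini` executable is
# on PATH and reads a non-interactive prompt from stdin.
import subprocess
import sys

def review_diff(diff_text: str) -> str:
    """Ask the CLI agent to flag risks in a diff; returns its reply as text."""
    prompt = (
        "You are reviewing a pull request. List potential bugs, missing tests, "
        "and security issues in this diff:\n\n" + diff_text
    )
    result = subprocess.run(
        ["gemini"],            # hypothetical non-interactive invocation
        input=prompt,
        capture_output=True,
        text=True,
        timeout=300,
    )
    if result.returncode != 0:
        sys.exit(f"gemini CLI failed: {result.stderr.strip()}")
    return result.stdout

if __name__ == "__main__":
    diff = subprocess.run(["git", "diff", "HEAD~1"],
                          capture_output=True, text=True).stdout
    print(review_diff(diff))

Wrapping the call this way keeps the agent's output in the pipeline log, where a human can still decide whether to act on it.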
Bigger Picture
Over the past year, command-line AI tools have proliferated. OpenAI Codex CLI, Anthropic’s Claude Code, Copilot CLI, and others have shown that many engineers prefer to keep their hands on the keyboard, even when interacting with a large language model. At the same time, Gemini CLI fills a gap in Google’s broader stack. The same 2.5 Pro model now powers Code Assist in VS Code, Jules for asynchronous pull request reviews, and Vertex AI for hosted inference. By allowing the terminal agent to call those services — and by enabling it to run inside CI scripts, build pipelines, and container images — Google turns Gemini into an orchestration layer that spans both local machines and Google Cloud. Every free user who connects Gemini to their build scripts adds future momentum for paid tokens, higher-tier licenses, or increased Vertex AI usage. This freemium funnel mirrors the playbook Google once used with Gmail and later with Kubernetes.

MCP, first championed by Anthropic and now gaining traction across the industry, allows any compliant agent to connect with external data sources without custom adapters. Gemini CLI aims to make Gemini the easiest path when teams begin connecting AI to databases, observability systems, or internal knowledge bases. For now, Gemini CLI puts a substantial stake in the ground, with Google willing to subsidize a large number of tokens to earn that mindshare.
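Because MCP carries so much of that extensibility story, a toy server helps show what "connecting without custom adapters" means in practice. The sketch below uses the FastMCP helper from the open-source MCP Python SDK; the tool name and deployment data are invented for illustration, and the SDK surface may differ slightly between versions.

# Toy MCP server exposing one read-only tool that an MCP-aware agent
# (Gemini CLI, Claude Code, and similar) could call. Illustrative only:
# the dataset and tool name are invented, and the SDK API may vary by version.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("deploy-info")

# Invented example data: the kind of internal knowledge an agent cannot
# otherwise see without a custom adapter.
RECENT_DEPLOYS = {
    "checkout-service": "2025-06-24 14:02 UTC, build 8312",
    "billing-service": "2025-06-23 09:47 UTC, build 8270",
}

@mcp.tool()
def last_deploy(service: str) -> str:
    """Return the most recent recorded deployment for a service."""
    return RECENT_DEPLOYS.get(service, f"No deployment recorded for {service}")

if __name__ == "__main__":
    mcp.run()   # stdio transport by default, which terminal agents typically expect

Once a server like this is registered in the agent's settings, the model can call last_deploy the same way it calls any built-in tool, which is the whole point of the protocol.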
Dmitry Baraishuk • 5 min read
Microsoft Copilot AI vs ChatGPT
Adoption patterns
Microsoft can boast impressive breadth: roughly seven in ten Fortune 500 companies have at least opened a Copilot evaluation, and "multiple dozen" customers license the tool for more than 100,000 employees each. Yet most organizations still confine Copilot to small cohorts of hundreds or low thousands. OpenAI, by contrast, reports depth: about three million paying business users log into ChatGPT Enterprise — up 50 percent since early 2025 — and many of them arrived after first experimenting with Copilot. Frequently, the flow begins bottom-up, with staff using the public ChatGPT site. Once IT departments notice the grassroots demand, they buy enterprise licenses to regain control over data.

Mixed estates
New York Life, for example, is keeping both assistants running company-wide while it gathers evidence before committing to a default. Bain & Company shows a ratio of roughly 16,000 ChatGPT seats to 2,000 Copilot seats. Even at firms that made early bulk purchases — Amgen bought 20,000 Copilot licenses in 2024 — user behavior has drifted: the majority of Amgen’s employees now turn to ChatGPT for research and summarization, relying on Copilot mainly for Outlook and Teams chores.

Observed user behavior
Power users seeking precision jump to ChatGPT when Copilot clips their prompt or refuses a complex query. Casual workers stay inside native apps, clicking the Copilot side panel in Word or Outlook because it is already docked there. Legal staff funnel contracts into ChatGPT for faster summarization, while finance departments draft routine emails through Copilot. Perceived quality also swings: employees report that Copilot’s answers sometimes feel "smarter" late at night — likely a symptom of upstream load-balancing — whereas ChatGPT shows steadier performance.

Price lists
Copilot for Microsoft 365 is priced at a flat $30 per user per month, but that apparently predictable fee hides a hard throttle on tokens. Heavy users regularly exhaust their monthly quota within days and then face truncated or refused answers. ChatGPT Enterprise has long been quoted at around $60 per user per month, yet the model is shifting to true consumption billing, so effective cost rises or falls with workload. Token policy is therefore a differentiator: where Copilot protects Microsoft’s cloud margins by clipping requests, ChatGPT lets customers buy extra capacity on demand, which finance teams appreciate because spending scales transparently with use.

Discounts
Microsoft often bundles Copilot into larger E5 or Azure deals, shaving the headline price. OpenAI reciprocates by trimming ChatGPT fees for customers that also adopt its vector-retrieval, embedding, or code-agent products. In reality, firms do well to model cost at workload granularity: literature searches and policy summaries may prove cheaper in ChatGPT on a per-thousand-token basis, whereas bursty scripting inside Visual Studio Code may fit Copilot’s flat rate.

Distinct strengths: Copilot
Copilot can read and write directly inside Outlook, Teams, Word, Excel, PowerPoint, OneDrive, Planner, Dynamics 365, Fabric, and Windows itself, eliminating copy-and-paste steps. Because it inherits Microsoft 365 identity and access controls, confidential files stay fenced within the tenant, and certain offline actions work even when no browser is open. Yet that integration imposes drag. Every OpenAI model upgrade must pass Microsoft’s validation pipeline, so Copilot can lag the public GPT release by weeks or months.
The assistant also adopts a conservative stance: responses are frequently brief, sprinkled with disclaimers, or sometimes refused outright. Lengthy documents are prone to truncation when they hit token ceilings, and context is not always as deep as marketing suggests — SQL Server users, for example, still have to paste queries manually so Copilot can "see" them.

Distinct strengths: ChatGPT Enterprise
It receives the latest GPT weights within days, boasts a long context window that swallows sprawling PDFs without truncation, and greets users with a single, no-frills web interface. Its open APIs favor custom retrieval-augmented generation pipelines, agent orchestration frameworks, and partner plugins. That freedom, however, requires more governance effort. ChatGPT cannot natively mine a user’s mailbox or calendar unless IT builds or buys connectors. Security teams must translate OpenAI’s contractual language into existing control frameworks, and enthusiastic employees sometimes craft their own mini-workflows that skirt official policy.

Model selection
GitHub Copilot Chat exposes a drop-down that lets developers choose among GPT-4, Claude, or lower-cost open models, so savvy teams can dial quality down when a quick, inexpensive answer suffices. By contrast, Microsoft 365 Copilot hides model choice entirely — every prompt is routed to the default engine, and the organization can neither swap models nor tier service levels by task.

Security and compliance
Both products promise that prompts and completions are excluded from model training, and both carry ISO 27001 and SOC 2 Type II certifications. FedRAMP High remains in progress for Copilot, with ChatGPT on a similar roadmap. Data residency follows the customer’s chosen region on either platform, although Copilot’s advantage is that it automatically respects the Azure tenant boundary, whereas ChatGPT needs explicit configuration. That simplicity matters in tightly regulated sectors. Still, one governance warning looms: Copilot’s "cross-tenant indexing" switch essentially invites the assistant to crawl every file in the domain. Treat that toggle as a formal, change-controlled event, and mirror the setting in a red-team sandbox first.

Branding and support
"Copilot" now labels an ever-expanding menagerie — GitHub Copilot, Microsoft 365 Copilot, Windows Copilot, Copilot Studio, and more — each with quirks that confuse end users. Microsoft’s renaming habit also muddies training material: think of Office morphing into "Microsoft 365" or Azure Active Directory rebadged as "Entra ID." Help-desk staff waste precious minutes on the opening question: "Which Copilot are you using?" By contrast, ChatGPT maintains a single domain and login, and its add-ons carry distinct names.

External competition
Anthropic’s Claude 3, Google’s Gemini 2.5, Mistral 3.1, Meta’s open-weight Llama variants, and small-footprint models like Gemma-1B are nipping at specialized tasks for a fraction of the price. Router services such as OpenRouter and NotDiamond.ai already switch between engines on a per-prompt basis to optimize cost or quality, normalizing multi-model stacks. Google’s Gemini "AI Mode" tab scores strongly on web research, though its enterprise licensing remains immature. On top of everything, tools like Perplexity and Cursor add retrieval layers or IDE context to OpenAI and Anthropic models, carving out niche returns on investment.
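The earlier advice to model cost at workload granularity lends itself to a quick worked sketch. The numbers below are placeholders rather than quoted prices: only the $30 flat seat fee comes from the comparison above, while the blended per-thousand-token rate and the token volumes are invented to show the structure of the calculation.

# Rough cost comparison: flat per-seat licensing vs. consumption billing.
# Only the $30 seat price comes from the article; the rate and volumes are placeholders.
FLAT_SEAT_PRICE = 30.0           # per user per month (flat-fee plan)
CONSUMPTION_RATE_PER_1K = 0.03   # assumed blended $ per 1,000 tokens (placeholder)

workloads = {
    # workload: (users, average tokens per user per month) -- invented volumes
    "email drafting": (500, 30_000),
    "literature search": (80, 400_000),
    "bursty scripting": (120, 1_500_000),
}

for name, (users, tokens_per_user) in workloads.items():
    flat_cost = users * FLAT_SEAT_PRICE
    usage_cost = users * tokens_per_user / 1_000 * CONSUMPTION_RATE_PER_1K
    cheaper = "flat seat" if flat_cost < usage_cost else "consumption"
    print(f"{name:18} flat ${flat_cost:>8,.0f}   usage ${usage_cost:>8,.0f}   cheaper: {cheaper}")

Run with real pilot data instead of these invented figures, the same comparison makes clear which tasks suit a flat seat and which are cheaper on metered billing.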
Relationship between Microsoft and OpenAI
After injecting roughly $14 billion into OpenAI, Microsoft enjoys a claim on up to 49 percent of OpenAI profits until an agreed cap and gains preferential model access for Copilot. The price is control: OpenAI cannot ship a new model into Copilot without Microsoft’s vetting. OpenAI, for its part, hedges by buying companies such as Windsurf — a code assistant that competes directly with GitHub Copilot — and by courting other clouds. Microsoft, meanwhile, funds rival model labs and is building its own family of large language models. Antitrust regulators may examine that revenue-sharing lock-in after 2026, and a ruling could force more symmetrical model releases across vendors.

What should an executive team do?
First, run parallel pilots of both assistants for at least a month, covering the full spread of everyday workloads — email drafting, meeting recaps, literature search, code generation, spreadsheet formulas — then capture token consumption, output length, and error rates. Assign each workload to the cheapest tool that meets quality and compliance needs.

Second, budget for hidden compute: Copilot’s flat fee may look lower until users hit the token ceiling, while ChatGPT’s usage billing can spike during ad hoc reporting. Set aside a contingency operating-expense line worth about 20 percent of projected AI spend.

Third, negotiate service levels around model parity. Ask Microsoft to guarantee that every new OpenAI model will reach Copilot within, say, 30 days, and back the request with exit or true-up clauses if they refuse. Ask OpenAI, conversely, to give two quarters’ notice before any major price-scheme change.

Fourth, strengthen governance early: nominate data-classification owners to sign off on any enterprise-wide ingest, apply role-based access to sensitive channels, pipe all prompt traffic into the SIEM for anomaly detection, and maintain a red-team mirror tenant to spot policy drift after each release.

Finally, prepare for a multi-agent future. Cursor, Perplexity, and lean open-source models are already filling specialist gaps. Building an API gateway or routing layer today will let tomorrow’s agents plug in under a common audit pipeline, minimize vendor lock-in, and keep compliance logging consistent. In short, the smartest strategy is not to choose a single assistant forever, but to cultivate the flexibility to route each task to the best, cheapest, or safest engine available.
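Capturing token consumption, output length, and error rates during such a pilot does not require special tooling; a shared log with a handful of fields is usually enough. The sketch below shows one minimal shape for that record, with field names and the CSV location chosen purely for illustration.

# Minimal sketch of a pilot-metrics log for comparing two assistants on the same
# workloads. Field names and the CSV path are illustrative choices, not a standard.
import csv
from dataclasses import dataclass, asdict
from pathlib import Path

@dataclass
class PilotRecord:
    assistant: str          # e.g. "copilot" or "chatgpt-enterprise"
    workload: str           # e.g. "email drafting", "meeting recap"
    prompt_tokens: int
    completion_tokens: int
    output_chars: int
    error: bool             # refusal, truncation, or factually wrong answer

LOG_FILE = Path("ai_pilot_metrics.csv")

def log_record(record: PilotRecord) -> None:
    """Append one observation to a shared CSV so the pilots can be compared later."""
    row = asdict(record)
    is_new = not LOG_FILE.exists()
    with LOG_FILE.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if is_new:
            writer.writeheader()
        writer.writerow(row)

if __name__ == "__main__":
    log_record(PilotRecord("copilot", "meeting recap", 1200, 350, 1900, False))

A month of rows in this shape is enough to compute cost per workload, refusal rates, and output length for each assistant before committing to a default.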
Dmitry Baraishuk • 5 min read
Goldman Sachs AI Assistant
What the assistant does
Inside Goldman’s firewall, GS AI Assistant can summarize dense documents, draft emails, pitch decks, and research notes, run descriptive analytics, translate client material, and act as both a Developer Copilot and an early-stage Banker Copilot. The team is also building multi-step, agent-style behavior so the assistant can carry out complete workflows on a user’s behalf. During the pilot, thousands of engineers used the Developer Copilot daily and reported productivity gains.

Underlying architecture
The assistant sits behind an internal compliance gateway that routes each prompt to the large language model best suited to the task. Today, GS AI Assistant uses OpenAI’s GPT, Google’s Gemini, Anthropic’s Claude, and several vetted open-source models. End users choose the model, while the bank maintains an audit trail and can swap models without retraining staff. Responses are generated first from Goldman’s proprietary data, and developers expect the system to incorporate increasing amounts of internal context over time.

Strategic intent
Chief Information Officer Marco Argenti lists AI, cloud migration, and data quality improvement as top technology priorities. CEO David Solomon views AI as a way to simplify and modernize an aging technology stack and improve firm-wide productivity. Executives describe the assistant as augmenting, not replacing, staff: it should reduce manual tasks so employees can focus on higher-value work such as judgment-based decisions and client relationships.

Early concerns about jobs
The internal memo shows that tasks traditionally performed by junior bankers will be automated. External studies estimate that up to 200,000 Wall Street jobs could disappear over five years, with back-office and entry-level roles most exposed. Current evidence mainly shows task-level savings: one investment bank, for example, reported that an AI system now handles routine reply emails, so it no longer needed to hire additional operations clerks to process them manually.

The data bottleneck
Most financial-sector AI budgets now go to cleaning, federating, and governing data. Banks that lag in data modernization risk hallucinations, compliance breaches, and poor user experience. Goldman’s multi-year cloud program and earlier automation projects provide a head start, but substantial work remains.

Competitive context
Goldman is the first Tier-1 investment bank to make a multimodel assistant available to its entire workforce, creating a new industry baseline.

Market reaction
After the memo leaked, Goldman’s shares closed at $646.88, up just under one percent. The consensus 12-month price target of $594.85 implies modest downside. Analyst ratings center on "outperform", suggesting expectations already assume successful execution.

Why the launch matters
Routing prompts through a model-agnostic, compliance-focused gateway shows a scalable, regulator-oriented architecture. The router-plus-firewall pattern is emerging as a template for other regulated firms (a generic sketch of the idea appears at the end of this piece). The move is likely to intensify competition among banks and raises questions about whether existing data infrastructure elsewhere can support similar scale.

Implications for vendors and peers
Goldman Sachs’s firm-wide release of GS AI Assistant signifies that AI is moving from pilot projects to core products.
Other large financial institutions will need to decide whether to accelerate data modernization and governance efforts as the industry shifts toward large language models as standard enterprise tools. Priorities for the next 18 months include deploying model-agnostic routers, designing human-plus-AI workflows, and strengthening defenses against prompt injection and data leakage. Delaying these steps could raise future costs and leave late adopters tied to frameworks defined by first movers. Requests for proposals increasingly treat summarization, code assist, and data analytics as baseline features. Investment in data quality tooling, regulatory technology, and AI security services will be needed to meet these requirements within typical two-year ROI windows.

Generative AI adoption in U.S. banking
Sector-wide, more than eight in ten U.S. financial institutions either use or plan to use generative AI tools, and a significant share have increased their 2025 AI budgets. Workloads now cover trading, payments, risk management, HR, and marketing.

JPMorgan Chase’s "LLM Suite" is available to more than 200,000 employees and anchors an annual technology budget of about $18 billion, with roughly 100 additional tools in the pipeline. JPMorgan CEO Jamie Dimon has told shareholders he aims to "win the AI arms race". JPMorgan reports "vibe coding" sessions in which plain-language prompting saves developers several hours per week. Morgan Stanley is running about 30 active AI projects through an internal idea conversion program and an OpenAI partnership, overseen by a newly appointed head of AI. Citi CEO Jane Fraser is compressing a four-phase modernization plan, led by CTO Shadman Zafar, into a two-year schedule. Bank of America states that more than 90 percent of its 213,000 staff use its "Erica for Employees" assistant, reducing IT service desk calls by over 50 percent.

Finance chiefs note, however, that most savings are recorded as time rather than immediate profit and loss benefit. Hard-dollar impact is expected to show in 2026 efficiency ratio targets. Generative AI is changing skill profiles. Business Insider reports that the technology is reshaping software developer roles and how junior bankers differentiate themselves, while also creating new C-suite positions focused on AI governance.

Bank leaders say they are struggling to keep pace with AI-powered cyberattacks. Federal Reserve Governor Michael Barr highlights a twentyfold increase in deepfake fraud attempts since 2022. The EU AI Act becomes partly applicable in August 2025, requiring transparency, robustness, and human oversight for "high-risk" models. U.S. legislative proposals, such as California’s draft AI safety bill, would require model "kill switches" for powerful systems. Boards are therefore demanding stronger controls, on-premises retrieval-augmented architectures, and stress-test-style red-teaming.

Banks are also beginning to test autonomous agents that can execute narrowly scoped tasks. HSBC, for example, is exploring AI bots to automate back-office analytics and small trading or reconciliation activities. Business Insider notes that institutions are already planning deployments of autonomous agents once guardrails are in place.
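The router-plus-firewall pattern referenced above can be sketched generically. This is not Goldman's implementation or any vendor's API: the policy terms, routing rules, and model stubs below are invented placeholders meant only to show how a compliance gateway can route prompts and keep an audit trail.

# Generic sketch of a compliance gateway that routes prompts to one of several
# models and appends an audit record. Policy terms, routes, and model stubs are
# placeholders; a real gateway would call actual model APIs and firm policy engines.
import json
import time
from typing import Callable

BLOCKED_TERMS = ("client ssn", "account password")   # toy policy rules

MODEL_BACKENDS: dict[str, Callable[[str], str]] = {
    "code":     lambda p: f"[code model answer to: {p[:40]}...]",
    "research": lambda p: f"[research model answer to: {p[:40]}...]",
    "general":  lambda p: f"[general model answer to: {p[:40]}...]",
}

def choose_route(prompt: str) -> str:
    """Very rough routing rules, purely for illustration."""
    lowered = prompt.lower()
    if "stack trace" in lowered or "def " in prompt:
        return "code"
    if "summarize" in lowered or "research" in lowered:
        return "research"
    return "general"

def gateway(user: str, prompt: str) -> str:
    if any(term in prompt.lower() for term in BLOCKED_TERMS):
        route, reply = "blocked", "Request blocked by policy."
    else:
        route = choose_route(prompt)
        reply = MODEL_BACKENDS[route](prompt)
    # Append-only audit trail: who asked, how it was routed, and prompt size.
    with open("gateway_audit.log", "a") as log:
        log.write(json.dumps({"ts": time.time(), "user": user,
                              "route": route, "prompt_chars": len(prompt)}) + "\n")
    return reply

if __name__ == "__main__":
    print(gateway("analyst42", "Summarize this research note for a client call."))

The value of the pattern is that models can be swapped behind the route names while the policy checks and the audit trail stay constant, which is what regulators tend to care about.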
Dmitry Baraishuk • 4 min read
What Microsoft’s Copilot Vision Launch Means for Software Development
Copilot Vision is located in the existing Copilot pane, which now docks on the right edge of the desktop. A new eyeglasses icon opens a pop-up that lists every open window, and the user simply selects which windows to share. Once permission is granted, the assistant can see whatever appears in those windows — such as an ERP dashboard, a CAD drawing, a code editor, a video game, or even two apps at once — and can respond in real time. Users stay in control: pressing Stop or X immediately ends the session, and nothing outside the selected windows is visible to Copilot. On some PCs, the feature will not start unless screen reader support is already enabled. On mobile, a user just points the camera and speaks, and Copilot responds in the selected voice, for example, "Wave." The same update adds Deep Research and File Search tabs for broader information retrieval.

Microsoft describes this feature as a "second set of eyes." Copilot now combines text and visuals, so it can summarize a PDF, explain a stack trace, draft an email, translate a road sign, suggest the next move in a game, guide a photo-editing adjustment in Photoshop, improve lighting on a picture, review a travel itinerary and packing list, coach a vacuum cleaner repair, or walk a user through an unfamiliar Windows settings page — all without leaving the current context. Because Vision can view two windows at once, it can connect information, such as matching calendar availability with dates on an events website or comparing spreadsheet data with a browser dashboard. The Highlights mode deepens this experience: a simple request such as "show me how" causes Copilot to highlight the exact button, menu, or text field the user needs. The assistant can also show related content proactively when it recognizes that help could be useful.

Vision is not Recall. There is no continuous screenshot capture, and users can revoke access at any time. Still, Microsoft acknowledges privacy concerns. Enterprises are advised to extend existing data loss prevention rules and audit logs to screen sharing with AI, especially because displays can show confidential information even if the underlying data never leaves the device. The product is part of Copilot Labs, Microsoft’s experimental incubator, and was previewed during the company’s fiftieth anniversary event in April 2025. Microsoft calls the launch a "major step forward," positioning it directly against Google Gemini Live and Apple Intelligence.

For software development teams, this change goes beyond convenience. Vision can process stack traces, terminal logs, design mockups, and Jira tickets in parallel, allowing an engineer to ask a single question instead of switching between tools. It can walk a junior developer through a refactor in Visual Studio by highlighting the right lines, compare a code difference to a specification during a review and explain any mismatches, and observe a failing user interface test while suggesting a possible root cause. Organizations piloting this feature should reinforce secure coding practices, prompt engineering patterns, and data loss prevention policies that prohibit sharing production secrets or personally identifiable data. Executives should update security training so staff understand that sharing a window means exporting its content.
Dmitry Baraishuk • 2 min read
Outage Affected Multiple Google Cloud Platform Products
Disrupted AI Services Used in Healthcare
Google confirmed that four of its cloud-based products — Vertex AI Online Prediction, Dialogflow CX, Agent Assist, and Contact Center AI — were temporarily unavailable during the outage. Vertex AI, a machine learning platform adopted by hospitals, research institutes, digital health startups, and pharmaceutical firms, underpins diagnostic decision support systems, generates personalized treatment recommendations from patient data, powers risk scoring models, and streamlines a range of operational workflows. Dialogflow CX and Agent Assist, likewise growing in healthcare use, drive both clinical support functions and day-to-day administrative tasks. Contact Center AI processes scheduling, triage, billing inquiries, and other virtual "front door" services.

Cloudflare Hit
Cloudflare keeps the configuration data for its Workers KV platform inside Google Cloud. Once Google’s Identity and Access Management (IAM) layer was down, Workers KV lost access to that data and went offline at 18:19 UTC. Any Cloudflare feature that depends on KV — Access, WARP, Durable Objects, Turnstile, and parts of the dashboard — also stalled. Cloudflare’s own status page identified "a third party dependency" as the cause. Because many popular apps rely on either Google Cloud or Cloudflare, outages showed up almost at once on monitoring sites. Spotify, Discord, Snapchat, Twitch, Shopify, and several others reported slowdowns or errors, while Amazon Web Services saw a brief spike in customer complaints because some of its users route traffic through Cloudflare.

What this means for leadership
A service advertised as "globally replicated" still failed because it relied on an unseen external system. Modern cloud identity layers have become a choke point: when they fail, multiple vendors go down together. Traditional "multi-cloud" strategies offer limited protection unless you also verify that vendors do not share the same upstream dependencies.

Next actions
Ask suppliers for clear maps of where their control plane components run. Require design changes (credential caching, local fallback modes) that let critical workloads keep running if an external identity service stalls. Include "identity layer outage" scenarios in resilience drills and board-level risk reviews. Taken together, these steps reduce both the likelihood and the business impact of the next cross-provider disruption.
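The "credential caching, local fallback modes" recommendation is concrete enough to sketch. The example below caches the last good token from an external identity provider and keeps serving it for a bounded grace period if that provider stops responding; the provider call is a stub and the 30-minute window is an arbitrary illustrative value, so treat it as a starting point rather than a hardened pattern.

# Sketch of credential caching with a bounded fallback window, so a workload can
# ride out a short identity-service outage. The fetch function is a stub and the
# grace period is an arbitrary illustrative value.
import time

GRACE_PERIOD_S = 30 * 60          # keep serving a cached token for up to 30 minutes
_cache = {"token": None, "fetched_at": 0.0}

class IdentityServiceDown(Exception):
    pass

def fetch_token_from_idp() -> str:
    """Stub for the real call to the external identity provider."""
    raise IdentityServiceDown("simulated outage")

def get_token() -> str:
    try:
        token = fetch_token_from_idp()
        _cache.update(token=token, fetched_at=time.time())
        return token
    except IdentityServiceDown:
        age = time.time() - _cache["fetched_at"]
        if _cache["token"] is not None and age < GRACE_PERIOD_S:
            return _cache["token"]        # degraded mode: stale but still working
        raise                              # outage exceeded the grace period

if __name__ == "__main__":
    _cache.update(token="cached-token-abc", fetched_at=time.time())
    print(get_token())   # falls back to the cached token during the simulated outage

How long the grace period can safely be is a policy decision: the longer the window, the longer a revoked credential might keep working, which is exactly the trade-off resilience drills should rehearse.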
Dmitry Baraishuk • 1 min read
