The AI World Woke Up

October 11, 2024

It all starts so nicely and gently in March 2024. Not much happening but then WAM! This week goes off like a family pack of Mentos in a 3L bottle of Coke. Let's ease ourselves in...

Anthropic Updates - March 4th 2024

So, Anthropic is the company behind Claude AI. It was founded by former members of OpenAI, Daniela Amodei and Dario Amodei. A few months before this update, Amazon announced an investment of up to $4 billion.

In March, they announced the Claude 3 model family, consisting of three state-of-the-art models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Each successive model offers increasingly powerful performance, allowing users to select the optimal balance of intelligence, speed, and cost for their specific application.

There are a few key features worth mentioning:

  • Claude 3 Opus outperforms its peers, including GPT-4, on most common evaluation benchmarks for AI systems, such as undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), and basic mathematics (GSM8K). It exhibits near-human levels of understanding and fluency on difficult problems.
  • The Claude 3 models have sophisticated vision capabilities on par with other leading models. They can process a wide range of visual formats, including photos, charts, graphs, and technical diagrams.
  • The models have been developed to have a more nuanced understanding of requests and make fewer refusals compared to previous models.
  • Anthropic has dedicated teams that track and mitigate a broad spectrum of risks, ranging from misinformation and Child Sexual Abuse Material (CSAM) to biological misuse, election interference, and autonomous replication skills.

The Claude 3 models are now available for use in claude.ai and the Claude API, which is generally available in 159 countries.

MY VIEW: Anthropic's Claude 3 models are a game-changer in the AI world. With impressive performance, vision capabilities, and a focus on responsible development, these models set a new standard. Their commitment to powerful, versatile, and responsible AI is evident in the Claude 3 family. As these models become widely available, they're poised to revolutionise industries and open up new possibilities. Claude might be my new fave.

NVIDIA GTC - March 22nd 2024

Nvidia revealed Blackwell as the successor to their Hopper GPU architecture for AI. This is the graphics processing units, e.g. without them AI wouldn’t work. This new architecture promises up to 30 times greater performance and 25 times less energy consumption for massive AI models compared to Hopper. That is huge. It’s getting more output and having less environmental impact. Win win.

They also showcased new high-speed network platforms, including the NVLink Switch, enabling 10 times faster communication between GPUs compared to the fastest networks used previously, as well as updates and additions to its DGX Cloud service for AI development and deployment. They are building the playground for the big kids we know the names of to play in. Slides, swings and roundabouts galore.

MY VIEW: NVIDIA has been the silent winner in almost all of the AI developments over the last few years. Their stock price has gone through the roof as they are providing the processing power to do all the AI processing!

Then comes the week to end all weeks in updates. First up,

Apple Updates - May 12th 2024

Apple seems to be the proverbial 'playa'. They have loads of people bought into their hardware and don’t they know it? They are showing a ‘bit of leg’ to OpenAI (and probably every other AI company) and seem to be leaning towards allowing the idea of getting into bed with Sam Altman and his crew.

The new ChatGPT MacOS app seems to be signalling an intent to forge a behemoth partnership. It’s not set in stone but with the advent of the Vision Pro, it seems like the emerging tech is becoming a little more connected. Tim Cook,CEO at Apple, has made it loud and clear that they're going all-in on AI.

They have recently introduced their latest iPad Pro lineup, which is powered by the company's newly developed M4 chip. Apple claims that the M4 chip boasts a CPU performance that is 50% faster than its predecessor, the M2. Additionally, the M4's GPU delivers an impressive four times the performance of the previous generation. However, the most remarkable feature of the M4 chip is its neural engine, which surpasses the capabilities of any neural processing unit currently available in 'AI devices' – devices that perform AI tasks locally, without relying on cloud computing.

The introduction of the M4 chip in the new iPad Pro lineup showcases Apple's commitment to pushing the boundaries of mobile computing and artificial intelligence, setting a new benchmark for performance and efficiency in the tablet market.

The annual Worldwide Developers Conference, happening on June 10, is set to be a mind-blowing reveal of all the AI goodies Apple's been cooking up. But we'll have to wait until September for the full public release. By baking chatbot features right into iOS 18, Apple's showing they're dead serious about making AI a core part of the user experience. They're not just following the crowd; they're leading the charge.

Apple's not just talking the talk; they're walking the walk. They're on a mission to completely reshape how we interact with technology, making our lives smoother and more connected than ever before.

MY VIEW: Apple's appetite for AI is as clear as day, and we're all waiting with bated breath for the juicy details to drop at the Worldwide Developer Conference next month. All eyes will be on how Apple plans to weave AI into their grand tapestry. If they go in with OpenAI, it could be the end for some of the other competitors and maybe even Microsoft will have to duck out of their investment in OpenAI.

ChatGPT Spring Update - May 13th 2024

The highlight of OpenAI's Spring Update is the introduction of GPT-4o, a cutting-edge AI model that can process and reason across multiple modalities, including audio, vision, and text, in real-time. This advanced model represents a significant leap forward in AI capabilities, enabling seamless interaction and understanding across different forms of data.

Key features of GPT-4o include:

  • Multimodal Processing. GPT-4o can process and understand audio, images, videos, and text simultaneously, allowing for more natural and intuitive interactions with users.
  • Real-Time Reasoning. The model can reason and provide responses in real-time, making it suitable for applications that require immediate feedback or decision-making.
  • Enhanced Language Understanding. GPT-4o has improved language understanding capabilities, enabling more accurate and contextual responses to user queries and prompts.
  • Expanded Knowledge Base. The model has been trained on a vast corpus of data, giving it a broad and deep understanding of various topics and domains. It’s up until 2023 rather than 2021.

In addition to GPT-4o, OpenAI is also expanding the capabilities of its popular ChatGPT model, making more advanced tools and features available for free to users. These include:

1. Code Interpreters: ChatGPT will now be able to understand and interpret code snippets, making it a valuable tool for developers and programmers.

2. Knowledge Retrieval: The model will have access to a vast knowledge base, allowing it to retrieve and provide relevant information on a wide range of topics. This includes memory of your conversations with it.

3. Improved Reasoning and Analysis: ChatGPT's reasoning and analytical capabilities have been enhanced, enabling it to provide more insightful and well-reasoned responses.

4. Expanded Language Support: The model will support a wider range of languages, making it more accessible to users around the world.

MY VIEW: OpenAI were the first movers and they proved it again before Google’s big announcement the next day. Even though co-founder and chief scientist Ilya Sutskever, who has questioned Sam Altman’s ambition to deploy AI too quickly, announced he was leaving to create a start-up, their grip on AI is strong. These updates to the free model + the idea of memory and improved functionality for developers - has won them over to the general public who get it free AND the serious code-monkeys who will use it for interpreting and debugging.

They got in before our next update - and made some serious noise. Loud enough to be heard but not as loud as the Hawaiian shirt that is…

Image Source: Midjourney: ai machine wearing hawaiian shirt on an acid trip -- ar 16:9

Google I/O - May 14th 2024

This was what I imagine it was like going on an acid trip in the ‘70s. You felt connexted but strnagely like you were in another dimension. Google know how to put this show on but there were some significant updates - well over 100 in fact. I will give some that I think are relevant.

  • Gemini 1.5 Flash is a lighter-weight model that’s designed to be fast and efficient to serve at scale. 1.5 Flash is the fastest Gemini model served in the API
  • Gemini 1.5 Pro is also available with a 2 million token context window to developers. This is equivalent to about 1.3 million words or over 10 novels. CNET described this as: “The more tokens a context window can accept, the more data you can input into a model. The more data you can input, the more information the AI model can use to deliver responses. The better the responses, the more valuable the experience of using an AI model.”
  • Project Astra is the next generation of AI assistant. It remembers, communicates and understands in almost real time. The video is the only way to do this justice (see below)
  • Imagen 3 is their new image generation tool that understands natural language and intent behind your prompts and incorporates small details from longer prompts to produce photorealistic, lifelike images and it can create text based images better! There are also some new safety measures for copyright and verification.
  • Veo is their new video generation tool. 1080p.
  • Music AI Sandbox is their new generative music tool with collaborations with loads of artists, including Wyclef Jean.
  • The Gemini side panel in Google Workspace will bring the power of GenAI within Drive, Docs, Slides, Sheets and the like - this is a brilliant but paid for feature.
  • Notebook LM is an experimental AI tool by Google that aims to reimagine notetaking and research. Users can upload their own documents to create a personalised virtual research assistant versed in their specific information. It automatically generates summaries, key topics, and suggested questions to help users understand complex material. Users can ask open-ended questions and receive answers grounded in their provided sources. Notebook LM can also assist with creative tasks like idea generation based on the user's inputs. Google is rolling it out gradually to develop the technology responsibly, keeping user data private. The core value is providing an AI collaborator to help users gain insights and accelerate work using their trusted sources.
  • LearnLM aims to make learning more engaging, personal and useful by inspiring active learning, managing cognitive load, adapting to the learner, stimulating curiosity, and deepening metacognition. It will enhance learning experiences across Google products like Search, YouTube, and when chatting with Gemini. In Search, users can adjust AI overviews to simplify language or break down complex topics. On Android, Circle to Search will help solve complex math/physics problems involving formulas and diagrams. Gemini will offer "Gems" - customised AI assistants for specific topics, including a "Learning Coach" to guide studying. On YouTube, an AI tool will allow users to ask clarifying questions and take quizzes while watching educational videos.
  • There are two new experimental tools leveraging generative AI for learning:
    • Illuminate - Breaks down research papers into audio conversations for easier understanding.
    • Minerva - Helps create interactive study guides from long videos by summarising, transcribing, and enabling Q&A.
  • And some new Android features but my brain was hurting by this point!

MY VIEW: Google has gone hard. They are playing catch up to get back market share and I think they are moving fast and making big strides. They have pissed people off with their pricing but I do think they will win out. Their UI and ease of use + their web dominance will make them a force to be reckoned with.

YouTube Video: Project Astra

UAE Announcement - May 15th 2024

The UAE is proving it's a force to be reckoned with in the global AI race. Abu Dhabi's Technology Innovation Institute (TII) just dropped the Falcon 2 series - a text-based model and a vision-to-language model that's open-source. Emirati firm G42 even ditched Chinese investments and hardware to secure a sweet $1.5B from Microsoft.

With the upcoming Falcon 3 generation and the UAE's knack for swift strategic decisions, they're poised to make some serious waves. The ruling family's got deep pockets too - we're talking $1.5 trillion in sovereign wealth funds. They're using that cash to fuel growth in AI and other cutting-edge tech.

The UAE's even launching a 'Generative AI' guide to help everyone from government agencies to entrepreneurs harness the power of AI.

MY VIEW: Wherever there is the cash of these oil-rich nations, there is potential for significant disruption. Whether it will go mainstream? I am not so sure but it might force the others to move faster!

So, corporations, countries and cooperatives are taking this seriously and making big waves right now. But in all truth, by the time this goes out, it’s not worth the email it’s written on. It will be out of date quicker than you can spell AI…

On a related note, I can highly recommend "The Coming Wave: Technology, Power, and the Twenty-First Century's Greatest Dilemma" by Mustafa Suleyman and Michael Bhaskar. They explore the transformative potential and profound impact of emerging technologies on society, politics, and the global economy. The authors delve into how innovations in AI, biotechnology, and other advanced technologies are reshaping industries, economies, and everyday life, leading to significant and unprecedented changes.

The book examines the shifting power dynamics between nations, corporations, and individuals brought about by these technological advancements. It raises crucial questions about control, influence, and inequality, highlighting the ethical dilemmas and societal impacts, such as privacy concerns, job displacement, and the potential for misuse of technology.

It’s a call to arms for thoughtful regulation and policy-making to harness the benefits of new technologies while mitigating their risks and ensuring equitable outcomes. They provide a comprehensive analysis of the current technological landscape and speculate on the future trajectory of technological development, urging readers to consider the long-term consequences of today's technological decisions.

It’s well worth a read (even if all the updates are out of date!)

Until Skynet…

Subscribe Now

Subscribe to receive the latest blog posts directly to your inbox every week.

By subscribing, you agree to our Privacy Policy.
Thank you! Your submission has been received!
Oops! Something went wrong. Please try again later.