Full list of content consumed, including annotations
54 highlights & notes
24 minutes Engaged reading, read (01/03/23)
“For me, the big story about #gpt3 is not that it is smart — it is dumb as a pile of rocks — but that piles of rocks can do many things we thought you needed to be smart for. Fake intelligence may be dominant over real intelligence in many domains.”
“Pattern recognition is intelligence!”
What sets most of the highest performers apart? They are highly skilled at spotting, emulating, and combining patterns of success most of us don't see. In our efficiency-driven, largely conformist society, you can easily argue economic success is 95% pattern recognition, and 5% bold originality. Most jobs don't ask us to be original.
Fake intelligence may be dominant over real intelligence in many domains.
Why is OpenAI suddenly okay with providing access to this?
First and foremost: one cannot help but think it warrants closer scrutiny that OpenAI said GPT-2 was too dangerous, but only months after adding a for-profit arm to its operations, was suddenly okay with making available something significantly more capable and dangerous. GPT-3 was released less than two weeks before the introduction of the OpenAI API to commercialize its AI.
4. Will people stop sharing their information and insight online?
The fact of the matter is that GPT-3 derives its power from humanity's collective knowledge. Its parameters are based on Common Crawl — a broad scrape of the 60 million domains on the internet along with a large subset of the sites to which they link — as well as Wikipedia, and historical books. If these same people begin losing their jobs as a result of insight they contributed online, will they still want to? Will there be a revolt when this becomes common knowledge? Will this open a new wave of IP protection issues? Will people be able to legally demand their insights aren't included in an AI's dataset? Could people choose to collectively cripple GPT-3's dataset? Should they?
We're already seeing this happen with paywalls and online communities.
Is it time to start taking UBI (Universal Basic Income) much more seriously?
YES!
Unemployment shouldn’t have been what we were ever looking for as a deeper warning sign. It’s underemployment, i.e. jobs that are well below the intellectual demands, monetary rewards, and lifestyle stability that the person’s training and capabilities should have generated.
100%
The central aspect people tend to overlook when arguing that automation just creates other jobs to replace the ones it eliminates, is that previous automation could not create other automations, or be so broadly and flexibly applicable. We also forget that the vast majority of automation was mechanical, and then rote information automation. None of it was credibly intelligent. Once intelligence happens, it overlaps with people and the value they add by thinking. Thinking, or at least what seems like thinking, as GPT-3 has shown us, is no longer job protection. The versatility of new automation negates the supposed creation of sufficient new jobs.
interesting point
medium.com |
Five years ago, Tim Urban published The AI Revolution: The Road to Superintelligence. Two images have always stuck with me from that post. Tim’s timeline of AI’s inevitable impact on human progress…
37 minutes Engaged reading, read (06/13/24)
I.—COMPUTING MACHINERY AND INTELLIGENCE
I don't think this article should be in the collection. Too much into the weeds.
This argument is very well expressed in Professor Jefferson's Lister Oration for 1949, from which I quote. “Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by the chance fall of symbols, could we agree that machine equals brain—that is, not only write it but know that it had written it. No mechanism could feel (and not merely artificially signal, an easy contrivance) pleasure at its successes, grief when its valves fuse, be warmed by flattery, be made miserable by its mistakes, be charmed by sex, be angry or depressed when it cannot get what it wants.” This argument appears to be a denial of the validity of our test. According to the most extreme form of this view the only way by which one could be sure that a machine thinks is to be the machine and to feel oneself thinking.
academic.oup.com |
I propose to consider the question, ‘Can machines think?’ This should begin with definitions of the meaning of the terms ‘machine’ and ‘think’. The definit
67 minutes Engaged reading, read (03/04/24)
A simplistic overview of AI and all its related components — i.e., AI for dummies. Highly recommend!
Understanding the influence of computational infrastructure on the political economy of artificial intelligence is profoundly important: it affects who can build AI, what kind of AI gets built, and who profits along the way.
A recent report from Andreessen Horowitz describes compute as “a predominant factor driving the industry today,” noting that companies have spent “more than 80% of their total capital on compute resources.”4
When we use the word “compute,” we sometimes mean the number of computations needed to perform a particular task, such as training an AI model. At other times, “compute” is used to refer solely to hardware, like chips. Often, though, we use “compute” to refer to a stack that includes both hardware and software.
Since 2015 however, trends in compute growth have split into two: the amount of compute used in large-scale models has been doubling in roughly 9.9 months, while the amount of compute used in regular-scale models has been doubling in only about 5.7 months.13
Running data centers is likewise environmentally very costly: estimates equate every prompt run on ChatGPT to the equivalent of pouring out an entire bottle of water.
But this push to “democratize” access to compute proceeds from the knowledge that compute-intensive research is largely dominated by industry, even in academic settings: in recent years, the largest academia-developed model used only 1 percent of the compute used to train the largest industry model.51 Concerns over the imbalance between industry and academic research in AI is the driving premise of the National AI Research Resource (NAIRR),52 and led the UK to announce plans to spend £100 million on acquiring compute for the country’s benefit.
Puts a bit of a damper on the thinking that AI will cure disease, aid doctors in diagnosis, and help solve climate change. If large research universities don't have access to compute, progress will be slowed.
Compute costs are predictably large: the final training run of GPT-3 is estimated to have cost somewhere between $500,000 and $4.6 million.55 Training GPT-4 may have cost in the vicinity of $50 million,56 but the overall training cost is probably more than $100 million because compute is required for trial and error before the final training run.
Compute is required both for training a model and for running it. For instance, one GPT-4 training run requires a huge amount of compute. But every question posed to ChatGPT also uses compute in generating a response. The latter is known as inference. An inference happens anytime a model generates a response.
Graphics Processing Units (GPUs): GPUs were initially designed for image processing and computer graphics, and excel at running many small tasks at the same time. Because of this, they are well suited for building AI systems: they can carry out calculations in parallel while sacrificing some precision, improving AI-related efficiency over Central Processing Units (CPUs), which are designed to serve as the core component of a computer to run a centralized operating system and its applications.
Nvidia’s H100 GPU now sets the industry standard
It’s worth noting that both OpenAI and DeepMind made decisions to be invested in and acquired,132 respectively, primarily due to the costs of compute.
Some experts believe that Moore’s law will inevitably slow down as transistors approach physical limits of size—they are now only a few atoms wide.148 Additionally, the rate of speed and efficiency improvements from an increase in the number of transistors is itself slowing but consistent.149 Others believe that innovations always emerge to keep growth consistent at a Moore’s law level.150 They use DNA as an indication that physical and chemical switches can be much much smaller and that it is physically possible for very small items to carry a lot of information.
Another way to reduce compute costs is to use existing compute more efficiently. This can be done by making smarter algorithms that use less compute for the same output.
AI researchers could reduce compute use by abandoning the current method of building very large models and instead seeking capability improvements through smaller models. There are few indicators that companies are seriously pursuing this strategy, even despite considerable harms associated with large-scale AI
Paradigm shifts in compute development, such as neuromorphic computing or quantum computing, could create an entirely new market structure and much higher compute capacity.
A separations regime as described above can prevent some of this wanton data use, but laws targeted toward data protection and sharing can go further. At the very least, clarifications on the legality of using data from other services to train AI models can deter such use. More watertight enforcement of the law will require monitoring mechanisms. For instance, recent work shows that the data used to train a model can potentially be verified.196 Governments can also act to further protect domain-specific data, including healthcare and education data where a concentrated AI market can be especially damaging.
ainowinstitute.org |
4 minutes Engaged reading, read (06/27/24)
Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response. Large Language Models (LLMs) are trained on vast volumes of data and use billions of parameters to generate original output for tasks like answering questions, translating languages, and completing sentences. RAG extends the already powerful capabilities of LLMs to specific domains or an organization's internal knowledge base, all without the need to retrain the model. It is a cost-effective approach to improving LLM output so it remains relevant, accurate, and useful in various contexts.
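A minimal sketch of the retrieve-then-generate flow that highlight describes, in Python; the tiny knowledge base, word-overlap retriever, and generate() placeholder are all invented for illustration and stand in for a real vector store and LLM call:

```python
# Illustrative RAG sketch: retrieve context, augment the prompt, then generate.
# Nothing here is a real vendor API; the knowledge base and generate() are toys.

KNOWLEDGE_BASE = {
    "returns": "Items may be returned within 30 days with a receipt.",
    "shipping": "Standard shipping takes 3-5 business days.",
    "warranty": "Hardware is covered by a one-year limited warranty.",
}

def retrieve(question: str) -> str:
    """Pick the document that shares the most words with the question."""
    q_words = set(question.lower().split())
    return max(
        KNOWLEDGE_BASE.values(),
        key=lambda doc: len(q_words & set(doc.lower().split())),
    )

def generate(prompt: str) -> str:
    """Placeholder for a call to a large language model."""
    return f"[LLM response conditioned on]: {prompt}"

def rag_answer(question: str) -> str:
    context = retrieve(question)          # 1. retrieve authoritative context
    prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"  # 2. augment
    return generate(prompt)               # 3. generate a grounded response

print(rag_answer("How long does shipping take?"))
```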
aws.amazon.com |
What is Retrieval-Augmented Generation, how and why businesses use Retrieval-Augmented Generation, and how to use Retrieval-Augmented Generation with AWS.
10 minutes Engaged reading, read (06/24/24)
ig.ft.com |
The technology has resulted in a host of cutting-edge AI applications — but its real power lies beyond text generation
34 minutes Engaged reading, read (06/20/24)
Our first instinct when interacting with a Large Language Model should not be “wow these things must be really smart or really creative or really understanding”. Our first instinct should be “I’ve probably asked it to do something that it has seen bits and pieces of before”. That might mean it is still really useful, even if it isn’t “thinking really hard” or “doing some really sophisticated reasoning”.
One should always verify the outputs of a large language model. If you are asking it to do things that you cannot competently verify yourself, then you should think about whether you are okay with acting on any mistakes that are made.
Self-attention means that the more information you provide in the input prompt, the more specialized the response will be because it will mix more of your words into its guesses. The quality of response is directly proportional to the quality of the input prompt. Better prompts produce better results. Try several different prompts and see what works best for you. Don’t assume the language model “gets” what you are trying to do and will give its best shot the first time.
mark-riedl.medium.com |
This article is designed to give people with no computer science background some insight into how ChatGPT and similar AI systems work (GPT-3, GPT-4, Bing Chat, Bard, etc). ChatGPT is a chatbot — a…
1 minute Engaged reading, read (02/20/25)
14 minutes Engaged reading, read (06/18/24)
Large AIs called recommender systems determine what you see on social media, which products are shown to you in online shops, and what gets recommended to you on YouTube. Increasingly they are not just recommending the media we consume, but based on their capacity to generate images and texts, they are also creating the media we consume.
Talk about a self-fulfilling feedback loop.
ourworldindata.org |
Despite their brief history, computers and AI have fundamentally changed what we see, what we know, and what we do. Little is as important for the future of the world, and our own lives, as how this history continues.
22 minutes Engaged reading, read (02/28/24)
This ability to adaptively react to situations, and improve over time, is intelligence. The adjustment of your rules based on experience is learning.
Understanding the different elements of AI: 10 key definitions
1. Algorithm: sounds like a fancy word, right? Well, there’s nothing new or innovative about it. An algorithm is simply a recipe for dealing with a problem. It’s a set of steps to be followed. You could apply a personal algorithm - a fixed series of actions and responses - to how to perfectly deal with your mornings.
2. Rules / rule-based systems: rules are just the core algorithms that define an AI system. All interface programming - how you use a website, or your phone, or any digital device - is based on rules. The earliest version of Artificial Intelligence was an attempt to make an intelligent machine by giving it so many rules that it would be able to deal with literally every possible scenario. This is referred to as a "zero learning" system, for obvious reasons. It was basically running a massive list of algorithms, just regurgitating canned responses. Of course, life presents us with infinite possibilities, so this approach didn’t work out so well:
3. Data: is information. In the case of an AI system, it's all the information it's picking up from its surroundings, and about the information itself (e.g. trends in the information, outliers, etc.).
4. Model: is the version of the world an AI system is designed to interpret and focus on. E.g. a machine that sorts everything it sees - its visual information - into red objects, and green objects. It is a red-green sorting computer vision model. It sees the world in red and green.
5. Machine learning: remember when we defined what intelligence was, and how learning made it possible by adjusting the rules with experience? Well, that's why machine learning goes hand in hand with Artificial Intelligence! (It is not the same thing as Artificial Intelligence, though it is often incorrectly used interchangeably.) A "zero learning" network applies fixed algorithms to respond to situations, so it never learns. A machine learning system applies algorithms, but also checks the outcomes, and based on that adjusts its algorithms. Machine learning basically means using algorithms to adjust the system's original algorithms, based on the outcomes of the system. (Cue Inception meme.) Example: Rollie the alarm clock could have an algorithm that adjusts the length and volume of its alarm based on how effectively it woke the user over the past month (see the sketch after this list).
6. Training / training set: the process of teaching a system by supplying its algorithm data to learn from. Most machine learning systems today are started by feeding in a set of information that the system can calibrate itself on. (See: Supervised Learning in the Further Reading section at the end.)
7. Neural Networks: a set of algorithms attempting to recreate (on a very limited scale) the dense interconnections of the human brain by densely interconnecting many simultaneously running AI models.
8. Deep learning: a type of machine learning where multiple layers of algorithms are layered over each other, with the output from one cascading into the next. For example, when the output of a neural network is fed into the input of another, it becomes a deep neural network.
9. Computer vision: one of the most immediately impactful uses of AI involves making sense of the data in images. Computer vision is the field dedicated to interpreting images and videos. It drives everything from automatic photo-tagging on Facebook, to how self-driving cars get around (and avoid killing people).
10. Natural language processing (NLP): the other primary application of AI is in understanding the ideas and intent of language. NLP is a core part of AI and has been around since the advent of the earliest computers. Most recently, deep learning has been applied to NLP to astonishing results. Take this recent case, for example:
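A toy sketch of the machine-learning idea in item 5, using the Rollie alarm-clock example; the wake-up history and adjustment rules are invented purely for illustration:

```python
# A system that adjusts its own rules based on outcomes (item 5 above).
# The data and adjustment logic are made up; this is not a real device API.

alarm_volume = 5        # initial rule: medium volume
alarm_length_sec = 30   # initial rule: 30-second alarm

# One month of "did the user wake up on time?" outcomes, invented for the example.
history = [True, False, False, True, False, True, False, False]

for woke_up in history:
    if not woke_up:
        # Bad outcome: the system adjusts its own rules to be more aggressive.
        alarm_volume = min(10, alarm_volume + 1)
        alarm_length_sec += 10
    else:
        # Good outcome: back off slightly.
        alarm_volume = max(1, alarm_volume - 1)

print(f"Learned settings: volume={alarm_volume}, length={alarm_length_sec}s")
```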
Most of the above concerns go hand in hand with the concept of Emergent Artificial Intelligence, which is the idea that Artificial Super Intelligence might unexpectedly, rather than deliberately, arise one day out of the increasingly complex AI systems we’re creating. This concept comes from the theory of Emergence, which is the idea of larger/ more complex things emerging from smaller/ less complex ones. Life on Earth is an example.
So even if we know that an AI made a bad decision, we don’t know why or how, so it’s tough to build mechanisms to catch bias before it’s implemented. The issue is especially precarious in fields like self-driving cars, where each decision on the road can be the difference between life and death. Early research has shown hope that we’ll be able to reverse engineer the complexity of the machines we created, but today it’s nearly impossible to know why any one decision made by Facebook or Google or Microsoft’s AI was made.
On one hand, AI today is nowhere near achieving super intelligence. Tech expert Ken Nickerson labels today's AI as more akin to AIS: artificial idiot savants. They are infinitely better than humans at a single or handful of tasks, but are otherwise totally useless, even compared to an average human child. He even goes so far as to suggest that the entire industry is on the wrong path, specifically that attempting to copy human intelligence so literally is no different from calling planes artificial birds and creating literal replicas, as early inventors did.
thecuriousreview.com |
From definitions to ethics, everything you were wondering about, simply explained in one place.
27 minutes Engaged reading, read (06/13/24)
If Kurzweil and others who agree with him are correct, then we may be as blown away by 2030 as our 1750 guy was by 2015—i.e. the next DPU might only take a couple decades—and the world in 2050 might be so vastly different than today’s world that we would barely recognize it.
Where We Are Currently—A World Running on ANI
Artificial Narrow Intelligence is machine intelligence that equals or exceeds human intelligence or efficiency at a specific thing.
waitbutwhy.com |
PDF: We made a fancy PDF of this post for printing and offline viewing. Buy it here. (Or see a preview.) Note: The reason this post took three weeks to finish is that as I dug into research on Artificial Intelligence, I could not believe what I was reading. It hit me pretty quickly that what’s happening in the world of AI is not just an important topic, but by far THE most important topic for our future. So I wanted to learn as much as I could about it, and once I did that, I wanted to make sure I wrote a post that really explained this whole situation and why it matters so much. Not shockingly, that became outrageously long, so I broke it into two parts. This is Part 1—Part 2 is here.
10 minutes Engaged reading, read (06/24/24)
writings.stephenwolfram.com |
Stephen Wolfram explores the broader picture of what's going on inside ChatGPT and why it produces meaningful text. Discusses models, training neural nets, embeddings, tokens, transformers, language syntax.
6 minutes Engaged reading, read (06/27/24)
axios.com |
From hallucinations to transformers and AGI to LLMs, here's a crib sheet to all the AI lingo you're hearing.
1 minute Engaged reading, read (02/20/25)
10 minutes Engaged reading, read (06/24/24)
ibm.com |
Artificial intelligence leverages computers and machines to mimic the problem-solving and decision-making capabilities of the human mind.
10 minutes Engaged reading, read (06/20/24)
retrieval-augmented generation (RAG) comes in. RAG provides a way to optimize the output of an LLM with targeted information without modifying the underlying model itself; that targeted information can be more up-to-date than the LLM as well as specific to a particular organization and industry. That means the generative AI system can provide more contextually appropriate answers to prompts as well as base those answers on extremely current data.
RAG lets the generative AI ingest this information. Now, the chat can provide information that’s more timely, more contextually appropriate, and more accurate.Simply put, RAG helps LLMs give better answers.
Key Takeaways
- RAG is a relatively new artificial intelligence technique that can improve the quality of generative AI by allowing large language models (LLMs) to tap additional data resources without retraining.
- RAG models build knowledge repositories based on the organization's own data, and the repositories can be continually updated to help the generative AI provide timely, contextual answers.
- Chatbots and other conversational systems that use natural language processing can benefit greatly from RAG and generative AI.
- Implementing RAG requires technologies such as vector databases, which allow for the rapid coding of new data, and searches against that data to feed into the LLM.
An additional benefit of RAG is that by using the vector database, the generative AI can provide the specific source of data cited in its answer—something LLMs can’t do.
RAG isn’t the only technique used to improve the accuracy of LLM-based generative AI. Another technique is semantic search, which helps the AI system narrow down the meaning of a query by seeking deep understanding of the specific words and phrases in the prompt.
Semantic search goes beyond keyword search by determining the meaning of questions and source documents and using that meaning to retrieve more accurate results. Semantic search is an integral part of RAG.
Benefits of Retrieval-Augmented Generation
RAG techniques can be used to improve the quality of a generative AI system's responses to prompts, beyond what an LLM alone can deliver. Benefits include the following:
- The RAG has access to information that may be fresher than the data used to train the LLM.
- Data in the RAG's knowledge repository can be continually updated without incurring significant costs.
- The RAG's knowledge repository can contain data that's more contextual than the data in a generalized LLM.
- The source of the information in the RAG's vector database can be identified. And because the data sources are known, incorrect information in the RAG can be corrected or deleted.
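A small sketch of the vector-search step behind those benefits, assuming NumPy is available; the document "embeddings" and query vector are made-up toy values rather than output from a real embedding model or vector database:

```python
# Semantic search in miniature: retrieve the document whose embedding is
# closest (by cosine similarity) to the query embedding.
import numpy as np

documents = {
    "refund policy": np.array([0.9, 0.1, 0.0]),
    "shipping times": np.array([0.1, 0.8, 0.2]),
    "warranty terms": np.array([0.0, 0.2, 0.9]),
}

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Pretend this vector is the embedding of the user's question.
query_embedding = np.array([0.2, 0.7, 0.3])

# The best match is what a RAG system would feed to the LLM, with its source known.
best_doc = max(documents, key=lambda name: cosine_similarity(documents[name], query_embedding))
print("Most relevant source:", best_doc)
```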
oracle.com |
1 minute Engaged reading, read (02/20/25)
1 minute Engaged reading, read (02/20/25)
17 minutes Engaged reading, read (06/27/24)
Fine-tuning is the process of taking a pretrained machine learning model and further training it on a smaller, targeted data set. The aim of fine-tuning is to maintain the original capabilities of a pretrained model while adapting it to suit more specialized use cases.
Briefly, fine-tuning and transfer learning are strategies for applying preexisting models to new tasks, whereas RAG is a type of model architecture that blends external information retrieval with generative AI capabilities.
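A compact sketch of fine-tuning as the two highlights describe it, assuming PyTorch; the "pretrained" backbone and the synthetic targeted data set are stand-ins invented for illustration, not a real pretrained model or real domain data:

```python
# Fine-tuning in miniature: keep a pretrained backbone, freeze it, and train
# only a new task-specific head on a small targeted data set.
import torch
import torch.nn as nn

# Stand-in for a pretrained model; in practice you would load real weights.
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32), nn.ReLU())
head = nn.Linear(32, 2)  # new layer for the specialized task

# Freeze the pretrained layers so their original capabilities are preserved.
for p in backbone.parameters():
    p.requires_grad = False

model = nn.Sequential(backbone, head)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Tiny synthetic "targeted data set" standing in for domain-specific examples.
x = torch.randn(64, 16)
y = torch.randint(0, 2, (64,))

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"final training loss: {loss.item():.3f}")
```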
techtarget.com |
Learn about fine-tuning and its role in machine learning and AI, including its real-world applications and how it compares with RAG.
16 minutes Engaged reading, read (06/24/24)
Another breakthrough came in 1956 when Allen Newell, Herbert Simon, and Cliff Shaw wrote a computer program called Logic Theorist. This was the first program capable of performing automated reasoning to simulate the way humans think to solve problems.
Symbolic AI is different in that it’s a knowledge-based approach that aims to imitate human intelligence by following a set of pre-coded, symbolic rules of reasoning.
thats-ai.org |
From ancient mythologies to present day astonishment, discover the history of artificial intelligence (AI) and the individuals who helped shape it.
24 minutes Engaged reading, read (06/26/24)
Researchers don’t understand exactly how LLMs keep track of this information, but logically speaking the model must be doing it by modifying the hidden state vectors as they get passed from one layer to the next. It helps that in modern LLMs, these vectors are extremely large.
This division of labor holds more generally: attention heads retrieve information from earlier words in a prompt, whereas feed-forward layers enable language models to “remember” information that’s not in the prompt.
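A brief sketch of the attention update those two highlights describe, assuming NumPy; the shapes, random weights, and lack of a causal mask make this a toy illustration rather than a real transformer layer:

```python
# One attention head: each word's hidden state is updated by mixing in
# value vectors from other words, weighted by query-key similarity.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hidden state vectors for a 4-word prompt, dimension 8 (tiny for clarity).
hidden = np.random.randn(4, 8)

# The head projects hidden states into queries, keys, and values.
Wq, Wk, Wv = (np.random.randn(8, 8) for _ in range(3))
Q, K, V = hidden @ Wq, hidden @ Wk, hidden @ Wv

# Scores say how much each word should attend to every other word;
# the weighted sum of values becomes the updated hidden state that is
# then passed on to the feed-forward layer.
scores = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
updated_hidden = scores @ V

print(updated_hidden.shape)  # (4, 8): one updated vector per word
```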
understandingai.org |
Want to really understand how large language models work? Here’s a gentle primer.
1 minute Engaged reading, read (02/20/25)
16 minutes Normal reading, read (06/20/24)
youtube.com |
How programmers turned the internet into a paintbrush. DALL-E 2, Midjourney, Imagen, explained.
30 minutes Normal reading, read (06/20/24)
youtube.com |
🔍 In this video, we unravel the layers of AI, Machine Learning, Deep Learning, and their applications in tools like #ChatGPT and Google #Bard. We first go thr...
1 minute Engaged reading, read (02/20/25)