Before diving into the topic this week, you probably noticed that you haven’t seen any posts in the last two weeks. There is a reason for that, for those who were unaware. The CliffsNotes version goes like this – I suffered a near-drowning episode at the end of July. Admittedly, I cannot recall quite a bit of it, but I hit my head while under the water. This led to another concussion, which later improved, but is still somewhat worse than expected. I took a few weeks off, and am slowly coming back. There you go. Oh, and I am doing well, solid – I take it day by day.
The trade show circuit is nearly upon us. More and more folks in L&D, Training, HR, and other departments who are seeking out learning technology and learning systems want AI. Vendors of all stripes are pitching AI. And companies, businesses, associations, and so forth are seeking to add AI. Corporate aims for Enterprise versions. EdTech (K-12, Higher Ed) aims all over the place, with the assumption, BTW, that AI detection tools for plagiarism will work – a complete falsehood – so save your money.
Employees at your business, company, or entity are using ChatGPT as we speak. They are doing it in their day-to-day work, regardless of whether they are entry-level or managers. The challenge is that many will not tell their boss they are doing so. You have HR departments that have no plan in place for dealing with this, let alone Legal getting involved, let alone a real understanding of everything around AI.
One company is already embracing AI for all its employees – Visa just announced that every employee will get access to GPT-4.
This Guide hopes to resolve some of those issues.
Shameless Plug – I now offer a service, on-site or virtual: a presentation, or a 1/2-day or full-day workshop, for L&D, HR, Training, EdTech, and Education (association) folks who need a strategy and approach for Gen-AI and AI (machine learning is one type). The sessions go further than just this, but that can be discussed. If you are a learning technology or learning system vendor, I offer a service around Gen-AI and AI – again, going in-depth – either online or on-site, as a presentation or a 1/2-day or full-day session. If you are interested in learning more – please contact me HERE. The shameless plug is now complete. And it won’t ever be brought up again – Oh, this info is not yet on my biz site, due to the remodeling of the biz site.
The AI Guide – Asking the right questions – to vendors or even internally
This is not a full guide, but there is enough here to help you – and it is strongly recommended reading if you are talking to vendors of learning systems, learning technology, talent/performance development, and other solutions. Plus there is plenty for those folks in L&D, Training, and HR who are recognizing that this is something they need to address. I decided to go with questions I hear quite a bit, plus a couple of things that folks think work one way, when in reality they work another.
Did you know?
Gen-AI requires a high amount of computing power and thus, the carbon footprint exceeds cryptocurrency. The high amount of computing power also requires a large investment in such computers (with those chips), and thus a high cost. To cool down the computers, water comes in. There are real concerns around the water aspect, for those curious.
I won’t go in-depth on the LLM (Large Language Model), but I will say that the most popular Gen-AI chatbot, ChatGPT, is built on an LLM. The underlying GPT model is just one of many LLMs that OpenAI (the company that developed it) has out there. There are a lot of companies that have developed their own LLM or lots of LLMs. Google is one, Microsoft is another – albeit the Azure one is OpenAI’s – and Amazon is one too. Meta has a few, and so on. Will Apple ever launch its own publicly? I say yes – but when, no idea.
Nor will I go into co-pilots, AI agents, and so on in this post. What I will go through is the following:
- The Questions to Ask your Learning System, Learning Technology, HCM, HRIS, and whatever other solution you want to buy that pitches AI.
- The difference between machine learning and Gen-AI
- ChatGPT and your employees, students, members, customers, and the person down the street who refuses to wave to you, but you find out, is on that committee at your business that is deciding on whether ChatGPT can be contained. Oh, and they never buy candy bars from your kid to raise funds for their school. Cheapie!
- Those data sets – Private vs. Public – A lot of companies believe Private Gen-AI will eliminate all hallucinations – Sorry to say, you are wrong.
The Questions to Ask your Learning System, Learning Technology, HCM, HRIS, and whatever other solution you want to buy that pitches AI
I was going to focus only on our industry – learning systems, learning technology, and specifically e-learning (which is our industry). However, with HRTech nearly upon us and DevLearn shortly after, plus the many folks in HR who purchase learning systems/learning tech, I do not want to leave them out when it comes to purchasing whatever (non-learning) system they are aiming to buy. Let’s not make this a habit, though. : ) (I kid, or do I?)
What I tend to find when it comes to AI – especially Gen-AI – is that vendor salespeople are all over the map with information. (Again, most folks think ChatGPT is the only game in town, and thus a vendor says ChatGPT because it is the common name, the way you say Kleenex when you know the product is a tissue.) There are those who are somewhat knowledgeable, but I have yet to find anyone who knows it inside and out – unless they are from a vendor that only does AI and the person you are speaking with knows LLMs and related items. For example, I spoke with the Co-Founder of Lucy.ai a while back, and he clearly was extremely knowledgeable. However, he is the co-founder, and we are talking about salespeople.
Thus, for right now, the common thread runs from somewhat knowledgeable to not knowing anything. Some are very honest and will find out the information for you. Then there are the others: as one sales exec told me about their salespeople (system not to be named), they do whatever they can to close the sale – even if it isn’t fully accurate. How nice.
If I were a vendor at a trade show, for example, I’d have a data scientist there, or someone else who can provide the answers you want. If the vendor doesn’t, then make sure you get the name and e-mail address of the person who knows the answer. Don’t assume the vendor will get back to you on this – I find it very rare. Think of it this way: when they scan your badge and tell you they will follow up because you are interested in their system, the actual % who do is beyond low. At a recent show I was at, only 1 out of all the vendors I spoke with ever followed up.
Items to Remember
- The probability that a vendor built their own LLM (and did not use a 100% free open source one, OR an open source one that you pay for – uh, they do) is low. Building your own LLM from scratch is extremely expensive, with no guarantee it will work. Let’s take Lucy.ai for example. They use the GPT model, as I recall GPT-4 (the model you get in ChatGPT through the paid ChatGPT Plus plan). In the learning system space, the vendors who have added Gen-AI are overwhelmingly using the LLMs from OpenAI – specifically the GPT models. The free version of ChatGPT uses GPT-3.5, which is very affordable compared to GPT-4, which is very expensive.
- To date, again within our industry, corporate-wise, I have yet to find a vendor who is using an LLM other than the one from OpenAI or the Azure one – which is the OpenAI one, the GPT models – even though the majority of vendors in our industry are on AWS (Amazon). This to me is odd, because Falcon LLM is intriguing; Claude2 can do way more than ChatGPT, and it is also 100% free; Cohere is totally Enterprise focused; and I could go on.
- I’ve written about the Token data fees, and why there are vendors who clearly don’t want to pay those fees – it gets really pricey – and thus try to skirt them. This will be a question to ask, BTW. Here is the info about Token data fees and why Token data is so relevant.
- ChatGPT’s data set is from 2021. It is public, as in via the web, but only through 2021. Thus any information you might seek that is, say, current won’t be there.
- Claude2’s data set is from 2022, although surprisingly it was able to produce information around an Apple release in Feb 2023.
Let’s Ask Some Questions
There are questions you should ask any vendor that pitches AI and whose offering you are interested in.
Q: Is the AI Generative-AI, Machine Learning, or a combination?
Reason: That token fee is why there are vendors who skirt – in that they may have some Gen-AI, and then use machine learning (no token fee) for the rest. In the learning system space, I know of a vendor or two that use Gen-AI, plus another source for scouring the net to bring back free content. Why? The Token data cost. Gen-AI is fee-based (they are using GPT-4) and the cost of the other source is lower.
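To get a feel for why token fees push vendors this way, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the model names, the per-1,000-token prices, and the usage numbers are placeholders, not quoted prices from any provider. Ask the vendor for the real figures.

```python
# Hypothetical per-1,000-token prices in USD. These are placeholders for
# illustration only; real prices differ by provider and model and change over time.
PRICE_PER_1K_TOKENS = {
    "cheaper-model": {"input": 0.0015, "output": 0.002},
    "pricier-model": {"input": 0.03, "output": 0.06},
}


def monthly_token_cost(model, users, requests_per_user, input_tokens, output_tokens):
    """Estimate one month of token fees for a model under the assumed prices.

    input_tokens and output_tokens are the average tokens per request.
    """
    price = PRICE_PER_1K_TOKENS[model]
    cost_per_request = (
        (input_tokens / 1000) * price["input"]
        + (output_tokens / 1000) * price["output"]
    )
    return users * requests_per_user * cost_per_request


if __name__ == "__main__":
    # Assumed usage: 1,000 learners, 20 requests each per month,
    # roughly 500 tokens in and 500 tokens out per request.
    for model in PRICE_PER_1K_TOKENS:
        cost = monthly_token_cost(model, 1000, 20, 500, 500)
        print(f"{model}: ${cost:,.2f} per month")
```

With these made-up numbers the cheaper model comes to about $35 a month and the pricier one to about $900. The exact figures don't matter; the gap is the point, and it widens as usage grows.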
Q: What data sets did you train your LLM on? – Ask this question only if the vendor has Gen-AI – whose foundation is an LLM.
Reason: Nobody wants old data sets, and the training data is crucial; it is the way Gen-AI learns. Did the vendor use public data (say the net, scouring sites) or private? Yes, you will be able to add data sets – depending on the LLM – but initially, the foundation has some data sets.
Bonus: If they say machine learning is their AI – it is also built on data sets. Ask them what data sets it was trained on and how old they are. Don’t get wrapped up in the “the data sets are constantly updated, and blah blah.” No kidding. You want to know the original data sets.
Can data sets have bias? Absolutely – which is why you want to know about those data sets. Bias in AI is a real concern and is showing up quite a bit. That holds even for those who add their own data – say within a company, enterprise or not. I am referring to your place of business.
Q: What LLM are you using? – Ask this only if they say they have Gen-AI. If the vendor says we use ChatGPT, you should then ask the following – “Are you using the GPT model or just ChatGPT?”
Reason: If they are using the GPT models, then they may be using, or thinking about using down the road, GPT-3.5 (the model behind the free ChatGPT) or GPT-4 (available in ChatGPT Plus). Once OpenAI launches GPT-5, that slides under the GPT models as well.
Q: What are you using to strip out PII?
A lot of people never ask this question, and I have found vendors who have added GPT into their learning system or learning technology that cannot respond to it. If they say it is in the LLM – ask how many PII items the LLM can strip out. There are PII vendors out there whose PII solution is a layer that goes over the LLM.
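To picture what “a layer that goes over the LLM” means, here is a minimal sketch: text gets scrubbed before it is ever sent to the model. This is an assumption-heavy illustration, not how any particular PII vendor does it. The function name `redact_pii` and the three patterns are hypothetical, and a real PII layer covers far more item types (names, addresses, employee IDs, and so on) than a few regular expressions can catch.

```python
import re

# Illustrative patterns only (US-style formats). A real PII layer handles
# many more item types and does not rely on simple regular expressions alone.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact_pii(text):
    """Replace each matched PII item with a [LABEL] placeholder.

    Returns the scrubbed text and a count of replacements per label, so you
    can see how many items of each type were stripped out.
    """
    counts = {}
    for label, pattern in PII_PATTERNS.items():
        text, replaced = pattern.subn(f"[{label}]", text)
        counts[label] = replaced
    return text, counts


if __name__ == "__main__":
    prompt = "Email jane.doe@example.com or call 555-123-4567. SSN 123-45-6789."
    scrubbed, counts = redact_pii(prompt)
    print(scrubbed)  # this scrubbed text is what would be sent on to the LLM
    print(counts)
```

The counts are the useful part for your vendor conversation: “how many PII items can you strip out” is really a question about how many item types the layer recognizes, and this toy version only knows three.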
There will be vendors who pitch Private Gen-AI or tell you that you are able to add all your materials, information (whatever it might be), files, and so on, from your company. So if you are in L&D or Training, this would be your training or learning materials, PDFs, guides, courses (in a couple of cases), and other types of content. If you are in HR, all your content.
The premise of Private Gen-AI is that it is your company’s data and not public. Thus it is private. No public. No net, no nothing – just within your own business/company. This protects you – think security and privacy.
What they fail to note
Even with your own materials, content, data, and so on, you will still have hallucinations. You may have bias too – since it comes from your content. Yes, the system learns from this, and so forth, but hallucinations are an inherent component of any LLM on the market. If a vendor says it does not produce hallucinations – they are either lying or clueless. A vendor might tout that they have guardrails in place to reduce hallucinations significantly – but nobody has removed 100% of hallucinations. There are plenty of experts in the AI space who believe that hallucinations will still exist in LLMs years from now.
I hear vendors pull the spin of higher accuracy with private, which gives you a false sense of security or hope. Right now, the accuracy with private data is between 95% and 98%. Vendors will push numbers such as 97% or 98%, as though this is great. The problem though is that this isn’t a test or exam you are taking. This is a form of neuromarketing, because of how your brain thinks. Focus on the minus – 2% to 5% of the information outputted will be a hallucination. I want to stress this doesn’t mean everything your employees see will be false or fake. However, it will appear in some responses.
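To see why “focus on the minus” matters, here is a quick arithmetic sketch. It rests on a simplifying assumption that is mine, not a vendor’s: treat the accuracy number as the chance that any one response is free of hallucinations, and treat responses as independent of each other. Real systems are messier, so read the output as a rough feel, not a measurement.

```python
def expected_hallucinations(responses, accuracy):
    """Expected number of responses containing a hallucination.

    Assumes `accuracy` is the per-response chance of being hallucination-free.
    """
    return responses * (1 - accuracy)


def chance_of_at_least_one(responses, accuracy):
    """Chance that at least one of `responses` contains a hallucination.

    Assumes responses are independent, which is a simplification.
    """
    return 1 - accuracy ** responses


if __name__ == "__main__":
    for accuracy in (0.95, 0.98):
        per_thousand = expected_hallucinations(1000, accuracy)
        per_fifty = chance_of_at_least_one(50, accuracy)
        print(
            f"accuracy {accuracy:.0%}: about {per_thousand:.0f} bad responses "
            f"per 1,000; chance an employee hits one in 50 prompts: {per_fifty:.0%}"
        )
```

Under those assumptions, 98% accuracy still means roughly 20 bad responses out of every 1,000, and an employee who runs 50 prompts is more likely than not to see at least one. That is what the 97% or 98% pitch leaves out.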
Which is why a human element must be involved. Always. Thus it is crucial to let your employees know that they need to re-check and verify before submitting or placing those responses into whatever they are working on.
When you have an LLM, you need a human or humans to maintain the LLM, check the data, and so on – all the time. But I digress, and that is another topic for another day.
Hallucinations – Fake or False information that the AI will push out. There are plenty of use cases or examples out there. I have heard vendors say their AI is infallible. It’s not. I have heard people say, well, my employees know that hallucinations exist. In my own research, I overwhelmingly find that is not the case. Even companies that are using their own private data erroneously believe there are no hallucinations. And even no bias.
Q: Can you untrain the LLM?
There are likely going to be times when you see errors or want to remove some of your data files from the LLM. There is a term for actually doing this, but I can’t recall it; the good news is that you can think of it as un-training the LLM. The bummer for folks is that you cannot un-train all of it. Thus, you will have some remnants of your data files in there. A lot of people just think you can remove the data and re-train that LLM. Nope, it doesn’t all go away.
So there you have it: a few questions to ask, a few “hey, did you know?” items, and a correction to the premise that a human is never needed to validate the information being presented – in fact, you will need at least one human. Always.
Remember, Gen-AI – which uses training data sets to generate text, images, video, audio, and so much more down the road – is totally going to revamp and change not just everything, but for us, corporate learning in a huge way.
The challenge I see today isn’t the number of learning system and learning technology vendors who lack Gen-AI (it makes sense, it is too early stage), nor that they are targeting it on their roadmap for 2024 – which makes 100% sense. No, it is the lack of planning around the realization that at some point, a company will want to connect its LLM to the learning system’s or learning technology’s LLM.
It isn’t going to be today or tomorrow.
Perhaps it will be an integration that doesn’t take a lot of time (ha), but what happens when that company has a different LLM, than yours?
Because it will happen.
And unless you understand the market for finding such a connection (and a few already exist), you are going to find out
That preparation and knowing the right answers in this case,
Is going to require you to go beyond
Asking Bing, Bard, or even down the road
Meta’s future LLM.