Tuesday, November 8, 2011

Artificial Intelligence (and Pattern Based Computing) to dramatically change Healthcare and other professions/industries

I was reading an article about how Artificial Intelligence, and by extension Pattern Based Computing, is becoming a hugely disruptive force in the economy (http://www.economist.com/node/21536460). I tend to agree that Artificial Intelligence and Pattern Based Computing are changing industries and jobs. That's kind of the point. If algorithms can detect tumors faster and better than radiologists and can evaluate case law better than paralegals and lawyers, then maybe they should. If Pattern Based Analytics can find patterns better than a fleet of analysts, maybe it should. It's the same old argument against robots in factories. Automation does take away jobs from people, but that is a very short-sighted view. Because of automation, there are many fewer people working in factories than there used to be (let's ignore outsourcing to countries that are actually increasing factory jobs). Instead people do other things. They work fewer hours in nicer environments doing more mental work than physical work.

So now AI is hitting its stride and has made it so that tedious searching and analyzing can be done without forcing a person to slog through it all day. I say that's great. In the short term it may hurt a few well-trained and highly paid people. Whole professions may disappear. But in the longer view this is a good thing. People will work even fewer hours in even nicer environments, maybe from home or wherever they want. And when freed from mind-numbing grunt work, more people will have time to put their energy into other things like being creative, something AI still has a long way to go to replace.

Agricultural workers became factory workers, who became white collar office drones, who will become something else. Maybe now is the time for the rise of the creative class of artists, musicians, and performers. Maybe they'll finally get paid more. Change can be disruptive and difficult but change is a good thing and we are built to adapt.

Monday, November 7, 2011

Domain Experts and Analytic Experts

So I was reading the discussion "Small data" in a "Big Data" World at http://www.patternbasedanalytics.org/2011/11/small-data-in-big-data-world.html and started thinking about the experts Michael discusses there: domain experts and analytics experts. Domain experts know how their business/research/processes operate. Analytics experts know how to quantify how different pieces of data interact. The ideal case is when there's overlap between the two skillsets in a single individual, because the delay caused by interactions between multiple groups is minimized. What I want to do here is describe the skill sets for each type of expert and see what kind of role pattern based analytics (PBA) can play.

DOMAIN EXPERTS: know what data is available, what the data means, have an intuitive or explicit understanding of what happens and why, have a good sense of how to know when something isn't right and what questions to ask, and are able to validate or invalidate conclusions based on their judgement or knowledge.

ANALYTIC EXPERTS: know how to perform data selection and preparation (this could be the process to reduce "big data" to "small data"), know how to perform analysis to discover relationships between variables for the purpose of gaining insight or making predictions, and are able to interpret, present, and report analytic findings to others, including actionable information.

PBA is well suited to handle the middle and last phases of the analytic expert tasks, making data-driven decisions more agile. Since domain experts are typically closest to the "action" of their domain, they are best suited to identify opportunities and propose solutions in their fields. With PBA on their laptop or workstation, substituting for an analytic expert, the domain expert can find, visualize, and explore the most influential relationships between datapoints in the data they know best. Then, armed with their conclusions, they are in the best position to report what they've learned and have the best information about what action to take. This may not be the ideal case, but it does make data analysis possible for domain experts, and it helps shorten the decision-making cycle for those who would otherwise have to wait for the analytic experts.
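To make "finding the most influential relationships" a little more concrete, here is a minimal sketch of the kind of question a domain expert might ask of a small dataset: which variables move most closely with the outcome I care about? It uses plain Pearson correlation as a stand-in; that choice is my assumption for illustration, not a description of how any PBA product works. The column names and numbers are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / (spread_x * spread_y)


def rank_variables(columns, target):
    """Rank every non-target column by the strength of its correlation with the target."""
    scores = {
        name: pearson(values, columns[target])
        for name, values in columns.items()
        if name != target
    }
    # Strongest relationship first, regardless of sign.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical small dataset a domain expert might already have on a laptop.
data = {
    "ad_spend": [1, 2, 3, 4, 5],
    "store_temp": [20, 22, 19, 21, 20],
    "sales": [10, 19, 31, 42, 48],
}

ranking = rank_variables(data, "sales")
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

Correlation only flags candidates. Deciding whether a flagged relationship is real, spurious, or already well known is exactly where the domain expert's judgement comes in.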

Are there any other skills which might be required for someone to be one of these experts or to make them more effective?

Tuesday, November 1, 2011

"Small data" in a "Big Data" World

"Big Data" has received a ton of buzz in recent years, and with good reason. Big Data is becoming more the norm in many industries, and it is important to be able to turn that data into actionable insight and knowledge. However, many datasets in business contain loads of information themselves while missing the moving target of being "Big Data." We should not forget that these datasets might lead to just as much insight as their bigger siblings.

Before Big Data became the latest buzzword, many people were doing more traditional analytics on their desktops. With the advent of the Big Data age, I feel as though people assume that small data is a solved problem, but that is far from the case. Experts in analytics have been able to provide answers to small data problems for years, but something will always be missing so long as the domain experts are separated from the process. The key to better solutions for small data problems is to provide an analytics platform powerful enough to provide real answers but simple enough that domain experts can use it to ask the right questions.

Thursday, October 6, 2011

What kinds of analysis of Twitter data are interesting?

As a follow-up to Joseph's post on the Pattern Based Analytics group about decoding the data in Twitter chatter, we know there is some information that Twitter can provide better, or at least faster, than virtually any other source. For example, news of an earthquake travels faster than the earthquake does. But besides that, what else is interesting? News about celebrity baby bumps and trips to rehab isn't what I would call interesting to the general public.

Tuesday, October 4, 2011

The Ultimate Question

In The Hitchhiker’s Guide to the Galaxy, aliens devise an elaborate computer system to find the ultimate answer to life, the universe, and everything. After millions of years of computing, the ultimate answer is finally revealed: 42. This nonsensical answer made the aliens realize that they found an answer without really knowing what the question was (so inevitably they built an even more complex computer to determine the ultimate question).

Unfortunately this is not too different from how analytics works in some organizations. Those with the statistical knowledge and computing resources work very hard finding answers, but those answers might not match the questions that really matter to the business. When domain experts see such 42-like answers to their questions, they can become frustrated and turn away from analytics.

Asking the right questions can be as important as finding the right answers. An analytics process should reflect this reality. Domain experts should be able to ask good questions, and the good answers they receive should then spark even more good questions. PBA is one way to empower those with the right questions with the ability to receive good answers.

Thursday, August 4, 2011

Pattern-Based Applications

I found an article a couple of days ago headlined "Brains And Bots Deep Inside Yahoo's CORE Grab A Billion Clicks" (http://www.fastcompany.com/1770673/how-yahoo-got-to-a-billion-clicks). In short, Yahoo goes to a lot of effort "to figure out how to serve up news that you, yes you, will find irresistible".

The process the article describes reminds me a lot of how one develops a Pattern-Based Analytics (PBA) application.
  • To start, can you define the application in terms of a question? In Yahoo's case: "For the stories I have available, which ones should I show the user so they are most likely to click on one?"
  • Do you have data to support the question? Yahoo "grabs a portion of Yahoo visitors as they arrive on the site and uses them as guinea pigs, tossing some of the new packages at them and seeing what attracts their interest. (Who gets lumped into that bucket is determined by a virtual "flip of the coin," so that the pool is not always populated by the same people.)"
  • Do you have data to discover patterns against? According to the article, "Yahoo generates a profile for each user based on information they've entered about themselves, like gender and age (if they're a registered Yahoo user), the places they've visited when they've come to Yahoo in the past, and the stories they've already seen during that particular visit."
  • Once you have your data, it's easy with LeapWorks PBA to discover and validate the patterns related to your question. In this story, Yahoo uses their analytics to _augment_ their editors, not replace them. The editors retain the ability to override algorithmic suggestions, but their system lets them know _why_ decisions are being made. This way, Yahoo's editors can incorporate new insights into which stories go to which users, which results in more clicks (hence the article). This is a key feature of PBA - transparency about where patterns come from. Without knowing what's going on under the hood in a computer's "train of thought", the experts will never fully trust the system.
  • If you know something will happen before your competition does, you have an advantage - one common PBA application is to gain an edge by realizing insights before others do. In Yahoo's case, "Yahoo’s editors get a jump on upcoming stories. Months before the wedding of Prince William and Kate Middleton, the Front Page team noticed a rising interest in the bride's sister, Pippa. So while it took the rest of the media a few days after the event to catch up to the frenzy surrounding the maid of honor (and her headline-grabbing bum), Yahoo’s editors were ready from the beginning."

In short, some PBA applications can be thought of as the process of
* framing a question
* collecting data to generate an insight
* using the insight to understand why certain things happen
* taking advantage of the why
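
As a rough illustration of the data-collection step in the Yahoo example, here is a minimal sketch of a "flip of the coin" test bucket and a click-rate tally per story. The function names, the 10% test fraction, and the sample events are all made up for illustration; the article does not describe Yahoo's actual implementation at this level of detail.

```python
import random
from collections import defaultdict


def assign_bucket(rng, explore_fraction=0.1):
    """Virtual coin flip: send a small share of visits to the test bucket."""
    return "explore" if rng.random() < explore_fraction else "serve"


def click_rates(events):
    """Tally clicks per story from (story_id, clicked) pairs seen in the test bucket."""
    shows = defaultdict(int)
    clicks = defaultdict(int)
    for story, clicked in events:
        shows[story] += 1
        clicks[story] += int(clicked)
    return {story: clicks[story] / shows[story] for story in shows}


def best_story(rates):
    """Pick the story with the highest observed click rate."""
    return max(rates, key=rates.get)


# Hypothetical observations from the test bucket.
events = [
    ("royal-wedding", True), ("royal-wedding", True), ("royal-wedding", False),
    ("stock-report", False), ("stock-report", True),
    ("stock-report", False), ("stock-report", False),
]

rng = random.Random(2011)  # seeded so the coin flips are repeatable
buckets = [assign_bucket(rng) for _ in range(1000)]
rates = click_rates(events)

print("share of visits in the test bucket:", buckets.count("explore") / len(buckets))
print("story to feature:", best_story(rates))
```

Because the bucket is chosen at random on every visit, the test pool is not always the same people, which is the point the article makes. A real system would also break the rates down by user profile rather than keeping one number per story.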

The question is... What is the question?

Friday, July 15, 2011

Healthcare, Marketing, and Patterns

Marketers often try to find patterns in their data in order to better advertise, and this got me thinking about connections to how health care professionals prescribe treatments.

We’re all familiar with companies sending email to market their products. What many don’t realize is that these companies leverage their data to try to discover patterns in what ads might be beneficial to you. This process is severely limited by the type, quantity, and quality of data available about a particular customer. For instance, I’ve never purchased any products through Groupon, so the only data they have for me is my ZIP code. So although I thought it was silly that they recently sent me, a single male, an offer for a women’s fitness center membership, they merely lacked the data to do any better. Amazon has been better for me, but not by much. I only purchase products there a couple of times every year, so each individual purchase carries more weight than it should. Just because I once purchased a television there does not mean that I am interested in seeing ads for TVs every month. Although it’s possible their algorithms could benefit from some newer concepts from pattern based analytics, their real issue is a lack of data about me. If I were to patronize them more frequently, they could offer ads more representative of my interests.

So how does this relate to healthcare? Well, just as I don’t patronize Amazon more than a few times a year, most people don’t visit their physician more than a handful of times a year either. The physician might have additional data in the form of family history, but that is always changing and tends to be incomplete. The intricacies of why “personalized medicine” has yet to prosper are beyond the scope of this conversation, but I suspect that data quality and quantity have been a hindrance. Better, more complete data about a patient’s health history could lead to a better understanding and therefore better treatment options and preventative measures.