Thinking Through Data (Notes from May 13 - 19, 2019)

This week we had a hefty dose of medicine, biotech and data discussions. However, the focus for these incredible people isn't just the technology and innovation, but rather the frameworks of thought that go into their decision-making. We can ask "Where did we go wrong?" or "What assumption do we make that should/would be challenged?"

Annie Duke is excellent at assessing this on the spot, but she also sheds light on the steps she takes to get to that level. Much of what we believe / view tends to be outcome-based, which is fine because that's easiest to see. However, the path to that success point is rarely consistent once we change the problem. That's what we should focus on - consistent thought processes to make strategic decisions. Outcomes are one metric for measuring decisions, but not the be-all and end-all.

AI came up with a few of our guests, including Mir Imran and Eric Topol. How does it play a part in how we will progress? What may AI teach us about our historical research? Are there more optimal methods of delivery?

Emily Oster and Hanne Tidnam discussed societal effects of technology and decision-making. With technology becoming more and more ingrained in everyday life, is there a point of no return or is it generally okay? Or, is there a limit? Time will certainly tell - studies / surveys don't have the information available to give us any insight since the time-scale is so biased toward the recent. Technology has outpaced the level at which we can make data-driven decisions in this manner.

Bruno Goncalves discussed text mapping and language processing, and how different dialects of the same language can indicate demographic patterns in various locations (focused initially in Brazil). He mentioned it was interesting to track words / phrases and how they changed throughout history, primarily for English - Spanish was harder to track. If more data were available (say, mobile text / audio or email?), it may be easier to break down by parts of towns/cities, but currently it was limited to general, larger blocks based on Twitter text (which is sparse, generally, anyway).

Suranga of Balderton Capital discussed his move from London to San Francisco, what he'd observed as a tech employee, then executive and founder, and the difference between the environments. His excitement for the future comes from the societal change that technology infrastructure may enable. Then there was a darker side when we heard from the author of Ghost Work - Mary Gray - who said there is a growing split of work designed for specificity as we improve the AI / ML models that have been deployed. What work is needed and what can be automated? Are those good? Biased? We'll see.

I hope you enjoy my notes and hopefully you'll check out a few that sound interesting! Thanks again to the people that continue to keep me up to date with what I love to learn.

  • Eric Topol (@EricTopol), author of Deep Medicine, cardiologist (Wharton XM)
  • Deep AI and medicine - asking how to properly apply deep learning or AI to medicine
  • Yet to have a drug made by AI
  • Bayesian principles failing in the medical community: 12% of women get breast cancer, yet we ask them to get mammograms every 1-2 years. Over a 10-year span, 60% will get a false positive (yikes!)
  • Adopters of Apple Watch and its cardiogram / heart information are all treated the same way. Young, healthy and curious people may not have any arrhythmia, so any abnormality (totally normal) may be misdiagnosed. Heart rhythm signal from one person may be different from another's
  • Says it's a sign of being behind (3rd industrial revolution vs 4th for the rest) when the stethoscope (invented 200+ years ago) is still the symbol of the industry
  • Analog, no option of recording and is still very subjective to the doctor listening
  • This, despite advances in imaging and scans otherwise
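
To get a feel for the mammogram number in the Topol notes above, here is a rough sketch of how a small per-screen false-positive rate compounds over repeated screenings. It assumes each screen is independent with the same rate, which is my simplification and not something from the episode. The function names are made up for illustration, and the only figures taken from the notes are the 60% and the 10-year span (5 to 10 screens at one every 1-2 years).

```python
def cumulative_false_positive(per_screen_rate: float, screens: int) -> float:
    """Chance of at least one false positive across `screens` screenings.

    Assumes every screen is independent with the same rate (a simplification).
    """
    return 1.0 - (1.0 - per_screen_rate) ** screens


def implied_per_screen_rate(cumulative_rate: float, screens: int) -> float:
    """Per-screen rate that would compound to `cumulative_rate` over `screens`."""
    return 1.0 - (1.0 - cumulative_rate) ** (1.0 / screens)


if __name__ == "__main__":
    # 10 years of screening: every 2 years (5 screens) vs every year (10 screens)
    for screens in (5, 10):
        rate = implied_per_screen_rate(0.60, screens)
        print(f"{screens} screens -> implied per-screen false-positive rate {rate:.3f}")
```

The point is only that a rate that looks harmless on a single test adds up quickly once the test is repeated on mostly healthy people.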
  • Mir Imran, Chairman & CEO of Rani Therapeutics (Bay Area Ventures)
  • Talked about the big equity stake his fund took in the company, wanting it to be a big hit
  • Drug development: say, an insulin pill compared to injections (150+ years spent on solving this). In the past, the stomach / pill form had only 0.5% or < 1% efficacy - can't take that many pills, and it isn't cost-effective
  • Designing a pill with a pH-dependent shell to identify the intestine, where pain receptors are limited (can inject no problem)
  • Pill recognizes pH change by dissolving outer shell (to pass thru stomach), then sugar-needle injects
  • 1000s of animal trials and just passing pills
  • Started trials ~18 months ago in Australia for pill traveling (using x-rays every 30 min to track). Then progressed into a drug for giantism, along with looking at other biologic diseases / solutions
  • Don't want to limit biologics to a single buyer (Novartis, Genentech, etc…) - keep it open. Investments from Novartis and Google initially; ~$150mln worth now
  • Mapping Dialects with Twitter Data (Data Skeptic 4/26/2019)
  • With Bruno Goncalves about work studying language - now @ Morgan Stanley
  • Started with research in CS and Physics, moved eventually to Apache weblogs, email, big router logs, Twitter conversations to study human behavior
  • Turned into looking at Twitter check-ins with the log using longitude and latitude
  • Language used: order maps, areas dominated by a specific language (drawing the boundary between French and Dutch in Belgium, for instance). Intermingling cities that attract many languages
  • Spanish changing from one area to another - everyday words, phrases
  • Can use the location data to determine the area of dialects - splitting Brazil, for instance (South, North and then central American)
  • Dividing grid cells into km x km - maybe not determinate of gradients of English vs Spanish since they were testing dialects of Spanish
  • Each row corresponds to a cell and its words, but the matrix essentially loses the meaning. Ran PCA and K-means for the clustering
  • He's gathered 10 TB of data from Twitter, corpora and looking at millions of tweets - too little data to look over time
  • Measuring language changes over time was difficult for Spanish, but easier with English
  • Used Google Books for each language, counting how many books each bi-gram appears in - popularity of words
  • Corpus of books published in UK vs US from Google Books (1800 - 2010) for reliable data, but further back was less popular
  • Could normalize for words on how American vs British words were (and the mixture)
  • Recently, looking at demographic splits of language now with more digital / online presence
  • Training spellcheck based on dialect or demographic splits
  • Doing this stuff part-time now - how to train a model to detect use of language in this sense. Using word embeddings to automatically detect the different meanings of slang
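
The grid-cell approach in the dialect notes above can be sketched roughly as follows: bucket geotagged tweets into cells, build one row of word frequencies per cell, then cluster the rows. This is a toy version under my own assumptions, not the method from the episode: cells are 1-degree squares rather than km x km, the PCA step is left out to stay dependency-free, the k-means is a bare-bones one with a simple farthest-point start, and the tweets and word list are invented for illustration.

```python
from collections import Counter


def cell_id(lat, lon, cell_deg=1.0):
    """Bucket a coordinate into a square grid cell of `cell_deg` degrees."""
    return (int(lat // cell_deg), int(lon // cell_deg))


def build_matrix(tweets, vocab, cell_deg=1.0):
    """One row per cell, one column per vocab word, values are relative frequencies."""
    counts = {}
    for lat, lon, text in tweets:
        row = counts.setdefault(cell_id(lat, lon, cell_deg), Counter())
        row.update(w for w in text.lower().split() if w in vocab)
    cells = sorted(counts)
    matrix = []
    for cell in cells:
        total = sum(counts[cell].values()) or 1  # avoid dividing by zero
        matrix.append([counts[cell][w] / total for w in vocab])
    return cells, matrix


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iters=20):
    """Bare-bones k-means; starts from rows that are far apart (deterministic)."""
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented sample data: (latitude, longitude, text)
VOCAB = ["carro", "coche", "computadora", "ordenador"]
TWEETS = [
    (19.4, -99.1, "mi carro nuevo"),
    (19.4, -99.1, "la computadora no sirve"),
    (20.7, -103.3, "vendo carro y computadora"),
    (40.4, -3.7, "mi coche nuevo"),
    (40.4, -3.7, "el ordenador va lento"),
    (41.4, 2.2, "vendo coche y ordenador"),
]

if __name__ == "__main__":
    cells, matrix = build_matrix(TWEETS, VOCAB)
    for cell, label in zip(cells, kmeans(matrix, k=2)):
        print(cell, "-> dialect cluster", label)
```

Cells that share everyday vocabulary end up in the same cluster, which is the sense in which location data can outline dialect areas.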
  • Matt Turck with Plaid co-founder, Podcast
  • http://mattturck.com/plaid/
  • CSR in Gaming Industry (Wharton XM)
  • Discussing how it can be weird to gain competitive advantage and share
  • CSR as largely subjective to the person looking in
  • Gaming companies chip in to gambling addiction hotlines / help / etc
  • Particularly in Las Vegas / NV, CSR survey identified direct opportunities in water conservation, energy and green energy. How can they run such large operations more efficiently?
  • Survey had about 80% of the gaming industry represented, from servicers to manufacturers to casinos themselves
  • Suranga Chandratillake, GP at Balderton Capital (20min VC 090)
  • Founded blinkx, intelligent search engine for video/audio content, in Cambridge in 2004. Led the company for 8 years as CEO through the journey to SF, building a profitable biz and going public in London
  • Early employee at Autonomy Corp, engineer in R&D and then CTO when he went to SF
  • Belief that a technical person shouldn't be CEO - idea has been replaced in US (probably split). Jack Dorsey, Zuckerberg (went into Dorsey and the idea of 2 companies that aren't really overlapping)
  • Been impressed by both Gates and Zuckerberg
  • Knowing the code and going through to understand everything
  • Zuckerberg's acquisition of Instagram (ridiculed for $1bn and how right he was - surpassed Twitter's users)
  • 2 Things for Tech going forward - changing infrastructure
  • VR and including drones or autonomous vehicles - hardware progression and mobile phones for what CAN be done
  • Societal difference - how we change to gig economy or how we work and spend time each day
  • Changing investments - just announced investment in Curious.ai. Shorter term where society is changing (eg: Nutmeg for planning pension plans, cheaper and available)
  • Structure for Balderton as equal partner VC - not sure it's the most efficient model, but thinks it's the best
  • Equal partnership where the partners all have the same number of votes, compensation and identical split. Removes the politics and impacts the partners and the stakeholders (founders that may be affected)
  • No single group that can carry the vote - hiring becomes very difficult because you need an equal
  • Knowing when to stick and when to switch. Deciding when to say "We don't do that" and pivoting and being right
  • Exciting company for investment that he's done recently - Cloud9. All development being done in the cloud now
  • Nick Leschly, CEO Bluebird Bio (Wharton XM)
  • Purpose Built, 5/14/19
  • A Guide to Making Decisions (a16z 5/12/19)
  • Emily Oster, Hanne Tidnam discussion
  • Health and personal spaces - "right" thing isn't a difference for some. Cocoa Krispies, eggs and the preference for why you eat them, not just health-related
  • Impact of breast feeding vs not and looking against obesity
  • Less likely to be obese if breastfed
  • Randomized trial for this is from Belarus in the 1990s, where they encouraged some parents to breastfeed
  • Siblings who differed in being breastfed - almost no correlation with whether they were or were not
  • Constrained optimization - money or time
  • As parent, making choices based on preferences but have constraints
  • Recency bias in current literature - relying only on data will be problematic in that it can only support a relation
  • Regression discontinuity as drawing a line and then seeing interventions and how they may change or affect outcomes
  • Screen time as having evidence that's just very poor
  • Not easy to run randomized trial on - iPad or lifestyle would have to change for trial
  • Apps in the short term don't have the info available for results / outcomes
  • Studies currently have little to no conclusions on phones/tablets
  • Bayesian updating on uncertainty - if 2 year old spends 50% of waking hours, that's bad. Where's the limit?
  • Last time went to doctor - very concretely made a recommendation of "2 hours max in a day"
  • How does the information get to that level of the system?
  • Conversations occur with doctors where they essentially make the lines that stick, pass on
  • Recommendations go out as arbitrary but are now taken as truth - terrible without the decision. People that take the recommendations are likely different or more health-conscious. Adds another layer (vitamin E, for instance) - those that adopted did many other things
  • Just be wrong with confidence - can be right, lots of good options
  • Innovating in Bets (a16z 5/8/19)
  • Annie Duke (@AnnieDuke), Pmarca, smc90
  • Every organization and individual makes countless decisions every day, under conditions of uncertainty.
  • Thought experiment: posing Seahawks run the ball and are stopped vs passing and throwing pick
  • Once there's a result, it's very difficult to work backwards to assess what the decision quality was. Outcome was so bad - it was in the skill bucket. Or - oh, there was uncertainty.
  • Very slow in NFL, for instance, to adopt analytics according to the numbers
  • 4th and 1 - should unanimously go for it since numbers say the other team will get 3, expected
  • As a decision maker, we likely choose the route that keeps us from getting yelled at
  • We allow uncertainty to bubble to surface - conflicted interest - long run vs short term
  • 2x2 matrix of consensus vs nonconsensus and right vs wrong
  • C-R: fine; NC-R: genius; C-W: fine (all agreed); NC-W: really bad
  • Outcomes are right and wrong - hard to swipe the outcomes away
  • Thinking or allowing uncertainty in a human driving and killing a human vs an autonomous vehicle. Black box of not understanding the autonomous vehicle vs THINKING we understand another human ("Didn't MEAN to do it")
  • Anybody in business - process, process, process
  • R/E business - everybody in room after appraisal is 10% lower vs 10% higher (similar analysis)
  • Outcomes come from good process - try to align an individual's risk with the corporation's. Rightest risk - model could be correct (tail result) or risk decision (and deploying resources)
  • Most companies don't have SITG (skin in the game) - how to drive accountability in a process-driven environment so results matter?
  • How do you create balance so the outcome you care about is the quality of the forecast? State the model of the world, places and what you think. How close are you to the forecast?
  • When you have a bad outcome and you're in the room? How many times do people say "Should we have lost more? Should we have lost less?"
  • Learning loss - negative direction on how to figure out? Poker example of betting X, X-C or X+C. Bet X and get a quick call: should've done X+C (learn, regardless of opponent getting a card)
  • How much can you move an individual to train thinking? Naturally, thinking in forecasts more. Will have reaction and lessons quicker.
  • Getting through facts quickly and having the negative feedback - not robots
  • Improved 2% which is amazing - what are YOU doing to not make it worse?
  • "Results-oriented" as one of worst for intellectual work - need a process.
  • Story of "something is happening" is not a good story. Hard to read journalism now for him because they're both "non-consensus".
  • She's optimistic that people can be equipped to parse narrative to be more rational. Pessimistic of framing and storytelling currently.
  • "Not making a decision" is making a decision but we don't think of it that way. Really unhappy in a position. Did the time frame thing "Will you be annoyed in 1 year?" - Yes. Nondecision didn't feel like decision.
  • Mentions not having kids - time, energy, heart and decisions associated with indecision.
  • On individual decision, you have: clear misses, near-misses, clear hits
  • Bias toward missing - don't want to stick neck out. Have to see it in aggregate. Forecasts.
  • Anti-portfolio / shadow book is really about when you include clear misses vs near-misses.
  • Fear is that the ones that hit are less volatile or less risky will be returning less than shadow portfolio.
  • Says it's difficult to do with 99:1 turndown vs investments. Sampling. Time traveling with portfolio - bounce out and see if "if this is 1 of 20 in portfolio" vs "invest vs not"
  • Conveying confidence vs certainty
  • I've done my analysis. This is my forecast. Is there some piece of information I can find out that would change my forecast? Bake it into the decision. Not modulating the forecast - 60% to 57%. Costs and time differences.
  • Putting confidence interval on earliest dates and another probability, inclusive, on latest day
  • "I can have it to you on Friday 67% of time and by Monday 95% of time."
  • Terrible to ask "Am I sure?" or others "Are you sure" compared to "How sure are you?"
  • Pre-mortems in decision analysis
  • Positive fantasizing vs negative fantasizing (30% closer to success by thinking of hurdles in front of you)
  • What happens if you ask out crush and they say yes? What happens if you ask out crush and they say no? - No's more likely.
  • Teams get seen as naysayers - individually write a narrative on pre-mortem - good team-player is how do you fail?
  • After outcome, we overweight regret - don't need to improve this much. Look out a year and see if it affects you.
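
One way to make "how close are you to the forecast?" concrete, in the spirit of the "Friday 67% of time" bullet above, is to keep a track record of stated probabilities and what actually happened. The sketch below uses the Brier score, which is a standard scoring rule that I am swapping in and that these notes do not mention. The track record is hypothetical and the function names are mine.

```python
def brier_score(track_record):
    """Mean squared gap between stated probability and outcome (0.0 is perfect)."""
    if not track_record:
        raise ValueError("need at least one forecast")
    gaps = [(p - (1.0 if hit else 0.0)) ** 2 for p, hit in track_record]
    return sum(gaps) / len(gaps)


def hit_rate(track_record):
    """Share of forecasts that came true, to compare against the stated confidence."""
    if not track_record:
        raise ValueError("need at least one forecast")
    return sum(1 for _, hit in track_record if hit) / len(track_record)


# Hypothetical record: six "done by Friday" promises, each stated at 67%.
FRIDAY = [
    (0.67, True), (0.67, True), (0.67, False),
    (0.67, True), (0.67, False), (0.67, True),
]

if __name__ == "__main__":
    print(f"stated 67%, delivered {hit_rate(FRIDAY):.0%}")
    print(f"Brier score: {brier_score(FRIDAY):.3f}")
```

Scoring the forecast rather than any single result is a small way to reward process over outcome: a 67% promise that misses once is not a bad forecast, but one that misses most of the time is.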
  • Mary Gray, author of Ghost Work: How to Stop SV from Building a New Global Underclass (Wharton XM)
  • Research with coauthor at Microsoft Research
  • Discussion of the b2b services that run in the background - used Uber's partner as example for driver ID verification. Beard vs no beard on default, for instance - need close to 100% accuracy and algorithms/vision can't pick that up just yet
  • Humans that run Mechanical Turk or various classification tasks
  • Technology used to be cat vs dog and now it's species of dog - likely tech will enable more specific classifications but not remove intervention. Surveys, captioning, translations, transcription, verifying location, beta testing are all tasks
  • Used another example for "chick flick" meaning or 2012 debate with Romney's "binders full of women" comment. Needed humans to assign relevance and to provide or connect proper context (Twitter, for instance)
  • Facebook and its content moderators, most social media companies do this
  • Trends in Blockchain Computing (a16z, 5/18/19)
  • Get paid for AI data or encrypted data - can still train a model on it but nobody would know exactly what the data is. Long term accrual - depends, also, on if it's on AWS vs open-source
  • Gold farming and ex-protocol websites in WoW, for instance - virtual worlds