magic_lobster_party

  • 1 Post
  • 155 Comments
Joined 4 months ago
Cake day: March 4th, 2024






  • Recommendation systems don’t need to be that complicated. In its basic form, one is just a list of the videos you’ve watched (or the content creators or topics). That list can then be compared with other people’s watch lists to get an idea of what else you might be interested in. No need for any advanced video recognition.

    Maybe this list is isolated within a single instance. Maybe it can be shared between instances. Different instances might use different recommendation systems.

    Again, it might not work as well as YouTube’s, but I don’t think it needs to.
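
To make the watch-list comparison concrete, here is a minimal sketch in Python. It is only one of many ways to do this: it assumes watch lists are plain sets of video IDs, and it weights other users by Jaccard overlap, which is my choice for illustration. The user names, video IDs and the `recommend` helper are all made up.

```python
from collections import Counter


def jaccard(a, b):
    """Overlap between two watch sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, histories, top_n=3):
    """Rank unseen videos by how similar their watchers are to `user`."""
    seen = histories[user]
    scores = Counter()
    for other, watched in histories.items():
        if other == user:
            continue
        weight = jaccard(seen, watched)
        if weight == 0.0:
            continue  # nothing in common, so this user tells us nothing
        for video in watched - seen:
            scores[video] += weight
    # Highest score first; ties broken by name so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [video for video, _ in ranked[:top_n]]


# Hypothetical watch lists, purely for illustration.
histories = {
    "alice": {"rust-intro", "linux-tips", "keyboard-build"},
    "bob": {"rust-intro", "linux-tips", "nixos-tour"},
    "carol": {"linux-tips", "sourdough"},
    "dave": {"cat-compilation"},
}

print(recommend("alice", histories))  # ['nixos-tour', 'sourdough']
```

The `histories` dict could just as well be local to one instance or merged from several, since the function only needs the sets.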


  • Recommendation systems are well studied. I don’t think it’s unreasonable to add some form of recommendation layer separate from (or on top of) the content delivery. It doesn’t need to be on par with YouTube’s, at least not until there’s a substantial amount of content.

    Most YouTubers rely on sponsors or Patreon. Podcasters do the same, and many of them self-host. So I don’t think an ad delivery system is really needed.

    I don’t see why it would have to work much differently from how Pocket Casts or Overcast already work.

    The first problem is getting content to the platform.




  • Easy solution: host an FTP server with all the videos. FTP existed long before YouTube was a thing.

    More advanced solution: torrents à la The Pirate Bay. High-quality videos were distributed this way long before YouTube even supported 1080p. PeerTube is based on a similar approach.

    The main problem is to attract content creators to the platform. The problem isn’t technical.


  • https://www.nature.com/articles/nmeth.4642

    This article uses different wording than I do, but in essence: statistics is mostly about using a known model to explain the data. Machine learning is mostly about finding any model that predicts the data well. Different purposes with some overlap. Some statistical methods are used in machine learning, but that doesn’t necessarily mean all of machine learning is statistics.

    The boundary between statistical inference and ML is subject to debate—some methods fall squarely into one or the other domain, but many are used in both. […] Statistics requires us to choose a model that incorporates our knowledge of the system, and ML requires us to choose a predictive algorithm by relying on its empirical capabilities.

    Another (potentially lower quality) article that is not from Nature, but discusses the meme in particular:

    https://www.datarobot.com/blog/statistics-and-machine-learning-whats-the-difference/

    Because of the large number of variables in machine learning datasets, the models developed from them can be simultaneously extremely accurate and almost impossible to understand. Statistical models, on the other hand are typically easier to understand because they are based on fewer variables, and the accuracy of relationships is supported by tests of statistical significance.








  • If parameters aren’t neatly interpretable then it’s bad statistics. You’ve learned nothing about the general structure of the data.

    Linear regression models are often great tools for explaining the structure of the data. You can directly see which parts of the input are more important for determining the output. You have very little of that when using neural networks with more than 1 hidden layer.
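
As a rough illustration of reading structure straight off the coefficients, here is a small pure-Python least-squares fit. The data is made up so that the output depends only on the first input, and the `solve` and `fit_linear` helpers are hypothetical names for this sketch, not any library's API. Note that comparing coefficient sizes like this only makes sense when the inputs are on similar scales.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations.

    Returns [intercept, coef_1, coef_2, ...].
    """
    design = [[1.0] + list(r) for r in rows]  # prepend a constant column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data: the target is exactly 1 + 3*x1 and ignores x2 entirely.
rows = [(0, 5), (1, 2), (2, 7), (3, 1), (4, 4)]
targets = [1 + 3 * x1 for x1, _ in rows]

intercept, w1, w2 = fit_linear(rows, targets)
print(f"intercept={intercept:.3f}  w1={w1:.3f}  w2={w2:.3f}")
```

The fitted weight on `x1` comes out close to 3 and the weight on `x2` close to 0, so the model itself tells you which input matters. A multi-layer neural network fitted to the same data would give you no single number to read that from.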