
PyPI is a "dumb" index, in the sense that it doesn't offer opinions on which library is best for a particular task. IMO (and my opinion doesn't actually matter here) this is the right approach for the overall OSS community: curation is a hard, mostly manual task, whereas most development on PyPI is aimed at decreasing maintainer burden, or at least holding it steady.

That being said, there are commercial offerings for these sorts of things[1].

[1]: https://cloud.google.com/assured-open-source-software



+1 for this approach (and thanks for all your work on PyPI William!).

FWIW, I think it's worth clarifying that PyPI is already involved in malware detection and takedowns (as are almost all the package registries). The curation that commercial vendors offer is a little more nuanced than excluding known malware (for example, allowing users to restrict their downloads to a "known good" set of packages, rather than "only" excluding "known bad" ones).

https://warehouse.pypa.io/development/malware-checks.html


The PyPI admins (including Dustin, who wrote this post) do way more work than me, much of which is on a volunteer basis. They deserve way more credit than I do for PyPI; I'm just the lowly contractor on a few security features :-)

And yes, that's an important distinction to make! PyPI does indeed "curate" in the sense that its policies include spam and malware removal, and a great deal of automated and manual triage work goes into that.


> That being said, there are commercial offerings for these sorts of things[1].

I had no idea this was a thing; it looks super useful. Is anybody familiar with similar offerings from non-Google companies? Or is anybody doing it for npm packages?


Then outsource the curation to someone else by letting people create a feed of trusted packages that pip can be configured to use. Then we could point pip at pip.name.com/packages.json instead of having to host an Artifactory instance for simple use cases.


This is something you can already do: you can host whatever curated package view you'd like using the Simple Repository API defined in PEP 503[1]. That PEP doesn't require that files be hosted on the same origin as the index (PyPI itself serves files from a different origin than its index), so you could perform the curation there by re-using PyPI's CDN URLs.

[1]: https://peps.python.org/pep-0503/
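
To make that concrete, here is a minimal sketch of generating PEP 503-style pages for a curated set of packages. The project name, file URL, and hash are placeholders, and the `/simple/` prefix and the function names are my own assumptions for illustration; in practice you would copy the real file URLs and hashes from the upstream index.

```python
import html
import re


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 describes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def render_root(projects: list[str]) -> str:
    """Root page: one anchor per curated project, pointing at its project page."""
    links = "\n".join(
        f'    <a href="/simple/{normalize(p)}/">{html.escape(p)}</a>'
        for p in sorted(projects, key=normalize)
    )
    return f"<!DOCTYPE html>\n<html>\n  <body>\n{links}\n  </body>\n</html>\n"


def render_project(files: dict[str, tuple[str, str]]) -> str:
    """Project page: one anchor per file.

    `files` maps filename -> (file URL, sha256 hex digest). The URL may live
    on a different origin than this index, e.g. an upstream CDN.
    """
    links = "\n".join(
        f'    <a href="{html.escape(url)}#sha256={digest}">{html.escape(filename)}</a>'
        for filename, (url, digest) in sorted(files.items())
    )
    return f"<!DOCTYPE html>\n<html>\n  <body>\n{links}\n  </body>\n</html>\n"


if __name__ == "__main__":
    # Placeholder data: not a real file URL or hash.
    curated = {
        "sampleproject": {
            "sampleproject-1.0.tar.gz": (
                "https://files.example.org/sampleproject-1.0.tar.gz",
                "0" * 64,
            ),
        },
    }
    print(render_root(list(curated)))
    print(render_project(curated["sampleproject"]))
```

If the output were served statically at a hypothetical https://pip.example.com/simple/, something like `pip install --index-url https://pip.example.com/simple/ sampleproject` should resolve only against the curated view.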


Thanks for the reference! The fact that it's supplied through GCP makes it very relevant for me.



