Commit 64fed36

Get all CPAN distributions using scroll API for MetaCPAN
This uses the Elasticsearch scroll API to fetch all CPAN distributions; see <https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-request-scroll.html>. Fixes #1961.
1 parent 82926d2 commit 64fed36

1 file changed: +10 -6 lines changed

app/models/package_manager/cpan.rb

Lines changed: 10 additions & 6 deletions
@@ -13,16 +13,20 @@ def self.package_link(project, _version = nil)
     end
 
     def self.project_names
-      page = 1
       projects = []
+      size = 5000
+      time = '1m'
+      scroll_start_r = get("https://fastapi.metacpan.org/v1/release/_search?scroll=#{time}&size=#{size}&q=status:latest&fields=distribution")
+      projects += scroll_start_r["hits"]["hits"]
+      scroll_id = scroll_start_r['_scroll_id']
       loop do
-        r = get("https://fastapi.metacpan.org/v1/release/_search?q=status:latest&fields=distribution&sort=date:desc&size=5000&from=#{page * 5000}")["hits"]["hits"]
-        break if r == []
+        r = get("https://fastapi.metacpan.org/v1/_search/scroll?scroll=#{time}&scroll_id=#{scroll_id}")
+        break if r["hits"]["hits"] == []
 
-        projects += r
-        page += 1
+        projects += r["hits"]["hits"]
+        scroll_id = r['_scroll_id']
       end
-      projects.map { |project| project["fields"]["distribution"] }.uniq
+      projects.map { |project| project["fields"]["distribution"] }.flatten.uniq
     end
 
     def self.recent_names
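
The scroll loop in this change can be exercised without hitting the network. The following is a minimal sketch, not the code from this commit: `scroll_distribution_names` and the `fetch` callable are hypothetical names introduced here, with `fetch` standing in for the `get` helper and assumed to return the parsed JSON response as a Hash. The canned responses are invented for illustration and mix array and scalar field values to show why `.flatten` may be needed before `.uniq`.

```ruby
# Sketch of scroll-style pagination. `fetch` is a hypothetical callable that
# takes a URL and returns the parsed JSON response as a Hash; it stands in
# for the `get` helper used in the diff.
def scroll_distribution_names(fetch, base: "https://fastapi.metacpan.org/v1", size: 5000, time: "1m")
  # The first request opens the scroll and returns the first batch of hits.
  response = fetch.call("#{base}/release/_search?scroll=#{time}&size=#{size}&q=status:latest&fields=distribution")
  hits = []

  # An empty batch signals that the scroll is exhausted.
  until response["hits"]["hits"].empty?
    hits.concat(response["hits"]["hits"])
    # Each response carries the scroll id to send with the next request.
    scroll_id = response["_scroll_id"]
    response = fetch.call("#{base}/_search/scroll?scroll=#{time}&scroll_id=#{scroll_id}")
  end

  # Field values may arrive as arrays, so flatten before de-duplicating.
  hits.map { |hit| hit["fields"]["distribution"] }.flatten.uniq
end

# Usage with canned (made-up) responses instead of the network.
pages = [
  { "_scroll_id" => "a", "hits" => { "hits" => [{ "fields" => { "distribution" => ["Moose"] } }] } },
  { "_scroll_id" => "b", "hits" => { "hits" => [{ "fields" => { "distribution" => "Moose" } },
                                                { "fields" => { "distribution" => ["Dancer2"] } }] } },
  { "_scroll_id" => "c", "hits" => { "hits" => [] } }
]
fake_fetch = ->(_url) { pages.shift }
names = scroll_distribution_names(fake_fetch)
puts names.inspect
```

Passing the fetcher in as an argument keeps the pagination logic separate from HTTP, so the termination condition and the scroll id hand-off can be checked in isolation.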
