libtracker-miner: Perform leveled notification in TrackerFileNotifier
The current notification process involves crawling over index roots
without restrictions, and querying the state of every file in the store.
This is fastest, but can get memory hungry on huge directory trees.

So split the process into 3 sequential steps, which are repeated from top
to bottom over the directory hierarchy:

- A directory is crawled, and the contents that currently exist in the
  filesystem are extracted.
- Only if the directory is an index root, or was found to exist in the
  store by a previous iteration, the directory and all contents found
  are looked up in the store by their URI. New and updated contents are
  detected by comparing mtimes.
- Only if the directory passed the second step and its mtime changed
  (which usually implies something was added or removed; at this stage
  only removals matter), query all elements in the store that
  nfo:belongsToContainer to it, and check for files that existed in the
  store but no longer exist. Deleted contents are detected in this stage.

The change has been done so there is some compile-time granularity in
the directory processing, currently controlled through the MAX_DEPTH
define. This switch controls the maximum depth of crawled/queried chunks,
which indirectly limits the number of GFiles (and all the misc data
around them) that are in memory at the same time.

From testing, first-time crawling performance is completely unaffected,
and second-time crawling on an unchanged directory tree shows a
negligible slowdown. The IN() match on an indexed property like nie:url
appears to take near-constant time, and the third, more expensive step
only happens when it is very likely that there are actual changes to
process.

So the MAX_DEPTH value has been set to 1 to keep memory usage to a
minimum (tracker-miner-fs now peaks on massif at 24MB, where it
previously grew to ~180MB early on, indexing 11304 folders and 123428
files).
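The gating between the three steps can be sketched as below. This is only an illustration of the decision logic described above, not the TrackerFileNotifier code: `DirState`, the `STEP_*` flags and `steps_for_directory()` are hypothetical names, and the mtimes are plain integers standing in for whatever the filesystem and the store report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-directory state; not a Tracker data structure. */
typedef struct {
    bool is_index_root; /* directory is a configured index root */
    bool in_store;      /* a previous iteration found it in the store */
    long fs_mtime;      /* mtime reported by the filesystem */
    long store_mtime;   /* mtime recorded in the store */
} DirState;

/* Illustrative bit flags, one per step of the leveled notification. */
enum {
    STEP_CRAWL          = 1 << 0, /* enumerate contents on the filesystem */
    STEP_QUERY_URIS     = 1 << 1, /* look up directory + contents by URI */
    STEP_QUERY_CONTENTS = 1 << 2  /* query nfo:belongsToContainer children */
};

/* Return the set of steps that would run for one directory. */
unsigned steps_for_directory(const DirState *dir)
{
    unsigned steps = STEP_CRAWL; /* step 1 always runs */

    assert(dir != NULL);

    /* Step 2: only for index roots or directories known to the store. */
    if (dir->is_index_root || dir->in_store) {
        steps |= STEP_QUERY_URIS;

        /* Step 3: only if step 2 ran and the directory mtime changed. */
        if (dir->fs_mtime != dir->store_mtime)
            steps |= STEP_QUERY_CONTENTS;
    }

    return steps;
}
```

On an unchanged tree only the first two flags are set for each directory, which matches the observation that the expensive contents query is skipped unless a change is likely.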
parent
68559df9