Spotlight & DEVONthink

Hello

Not only does Spotlight not find any text on the pdfs I have stored in DEVONthink (and they have all been OCRed by Acrobat), but neither does it find any text on the various rtfs containing important information that I have stored there either. My Powerbook therefore has one big dark area that Spotlight cannot penetrate, and it is an area where I keep much of my really important data. How do I fix this? If I cannot penetrate DEVONthink with Spotlight, then I can see no other option but to trash DEVONthink, after three or more years of use.

Matthew Whiting

Spotlight doesn’t index inside of bundles? Whoa… I didn’t know that :confused:

As for the RTF, the devs have commented (unless I’m on crack) that the database starting in DT2 is going to store the RTFs externally, like the PDFs.

Of course, that doesn’t solve the problem of Spotlight not indexing inside of the db directory, but then that’s supposed to change in v2 too, isn’t it?

That’s correct. Spotlight can’t index the current DT Pro database contents, as has often been noted on this forum.

All of the information that I consider important goes into one of my DT Pro databases, where I can search for it orders of magnitude more quickly than Spotlight, and with the results located in a much more powerful working environment.

But DT Pro version 2.0 will allow Spotlight indexing of database contents. :slight_smile:

I just tried some more searches, on Word documents I’ve had in DEVONthink for a while. DT fails to find anything. Spotlight finds the corresponding documents in my Documents folder instantly. Neither does DT manage to classify whatever I throw at it. There are 2000 documents stored in DT covering only limited fields. I was very keen on DT when I first got it, but after several years it still appears to be only partially functional. Have DEVON technologies taken their eye off the ball?

Matthew Whiting

Does search always work (if you’ve got the search parameters set properly)? Yes.

I’ve got over 20,000 documents in my main database. If a search query doesn’t produce something that I know is in my database, I’ll examine the settings in the Search window – and find they were too limiting. Perhaps I had the setting for a Name search rather than an All search. Or perhaps I had limited the search to a subgroup rather than database-wide. That’s why I always use Tools > Search rather than the search field in view and document toolbars. All of the search operator settings are visible.

Straightforward searches usually take 50 milliseconds or less on my computers. More complicated searches may take tenths of a second. And the search results are immediately available in the DT Pro working environment. My databases are self-contained, including of course all my own writing and Web captures, which are not indexed (in the current version of DT Pro) by Spotlight. So about the only time I use Spotlight is when I’m looking for .plist files. As noted, DT Pro 2.0 will allow Spotlight indexing. What I’d really like is for a selected Spotlight search result to open directly in DT Pro, making it available in the rich environment of DT Pro. That would also allow searches across multiple DT Pro databases.

Example: I just did an All/All Words/Exact/Database search for “Charles Thomas Huxley Darwin”. I wanted to look at contents that mentioned both Charles Darwin and Thomas Huxley. The search produced 5 results in 1 (one) millisecond. (The query operators that will be in DT Pro version 2.0 would let me phrase the query more precisely.)
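For readers unfamiliar with the "All Words" setting, here is a minimal sketch of the idea: a document matches only if it contains every word of the query, in any order. This is an illustration in Python, not DT Pro's actual implementation; the function names and the sample documents are made up for the example.

```python
import re

def tokenize(text):
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def all_words_search(query, documents):
    """Return the names of documents that contain every word of the query.

    `documents` maps a (hypothetical) document name to its text content.
    """
    wanted = tokenize(query)
    return [name for name, text in documents.items()
            if wanted <= tokenize(text)]

# Made-up sample data, only to show the behaviour.
docs = {
    "letters": "Thomas Huxley wrote to Charles Darwin about the review.",
    "voyage": "Charles Darwin sailed on the Beagle.",
    "lecture": "Huxley lectured on anatomy.",
}

print(all_words_search("Charles Thomas Huxley Darwin", docs))  # ['letters']
```

Only the first sample document mentions all four words, so it is the only hit. A real search engine would use a prebuilt index rather than re-reading every document, which is presumably where the millisecond timings come from.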

About Classification. When you click on the Classify button for a new addition, DT Pro will compare it to the documents in your existing organizational structure. If it doesn’t find a close match, it will not move the document (especially under the Auto-Classify setting). That’s good. DT Pro doesn’t make “random” filings of your documents. And that’s a cue that you might want to add new structure to your organizational scheme. Sometimes I’ll move documents that don’t fit anywhere in my current organization into an “Unclassified” subgroup. When I’ve got time, I’ll do some organization of that material manually, perhaps adding new groups or subgroups. Or I may select the contents of that Unclassified group and select Auto Group, to see if DT Pro can suggest useful groupings of those documents.
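The "no close match, no move" behaviour described above can be sketched as a similarity threshold. This is only an illustration: the word-overlap (Jaccard) score, the threshold of 0.2, and all names below are assumptions for the example, not how DT Pro's AI actually scores documents.

```python
import re

def words(text):
    """Lower-case the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Word-overlap similarity between two sets, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def classify(new_text, groups, threshold=0.2):
    """Return the best-matching group name, or None if nothing is close enough.

    `groups` maps a group name to the list of document texts filed in it.
    The threshold is an arbitrary illustrative value.
    """
    new_words = words(new_text)
    best_name, best_score = None, 0.0
    for name, texts in groups.items():
        group_words = set().union(*(words(t) for t in texts))
        score = jaccard(new_words, group_words)
        if score > best_score:
            best_name, best_score = name, score
    # Refuse to file the document when even the best group is a weak match.
    return best_name if best_score >= threshold else None

# Made-up groups, only to show the behaviour.
groups = {
    "Water": ["groundwater sampling procedures", "drinking water standards"],
    "Air": ["air emissions monitoring", "stack sampling for air quality"],
}

print(classify("groundwater monitoring wells and water sampling", groups))
print(classify("bank statement for March", groups))
```

The first call returns "Water"; the second returns None, which corresponds to leaving the document where it is, the cue that a new group may be needed.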

Hi Bill

Thanks for taking the trouble to reply. I must admit that I haven’t been using DEVONthink very much lately as I have been working on projects that do not involve the kind of data management that DT offers. However, I do have a long-term project for which a great deal of the data is in DT, plus most of my important personal data is stored there. I guess I was expressing my frustration that since I am not always within a DT environment I cannot find the data stored within it externally. This problem really should have been corrected as soon as possible after Spotlight and Tiger came onto the market.

I do not know what happened with the searching I complained about in my previous post. When I fired DT up again today and did the same searches the results were both instantaneous and correct as you suggested they should be. So, on that one I’m baffled. I will spend some time again reorganising DT and see whether I can really make it work as I want it to.

Matthew Whiting

If you export the files and folders that are in DT, Spotlight will find them. Even better is NotLight, a free search utility published by Matt Neuberg.

tidbits.com/matt/?@485.hTwcbRT0vJ4@
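For the export route, a quick way to confirm that Spotlight really sees the exported files is the `mdfind` command-line tool that ships with Mac OS X. Below is a small Python wrapper as a sketch; the folder path and search phrase are hypothetical, and the helper names are my own.

```python
import shutil
import subprocess

def build_mdfind_command(folder, text):
    """Build an mdfind command line that searches only inside `folder`."""
    return ["mdfind", "-onlyin", folder, text]

def spotlight_search(folder, text):
    """Run the search and return matching paths.

    Returns an empty list when mdfind is not installed (e.g. not on a Mac).
    """
    if shutil.which("mdfind") is None:
        return []
    result = subprocess.run(build_mdfind_command(folder, text),
                            capture_output=True, text=True, check=False)
    return [line for line in result.stdout.splitlines() if line]

if __name__ == "__main__":
    # Hypothetical export folder and search phrase; substitute your own.
    for path in spotlight_search("/Users/me/Documents/DT Export", "hazardous waste"):
        print(path)
```

If nothing comes back for text you know is in an exported file, Spotlight may simply not have finished indexing the folder yet.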

It really is time to integrate with Spotlight. Come on guys. Spotlight is MAJOR CORE functionality built into Mac OS X.

Great Mac Software integrates with the Core OS features.

99.9% of your customers want Spotlight integration, and the other .1 % are just too stubborn to admit it.

The revised database structure of version 2.0 of the DT/DT Pro/DTPO applications will integrate with Spotlight.

The current DT database structure was designed years before Spotlight, and Spotlight cannot identify those files that are contained inside the monolithic database files. It’s nontrivial to change the structure of the DT database in such a way that Spotlight integration will not only be possible, but may add special advantages to DT database users.

Thanks for the reply. Nice to know what is happening and why.

But then, when was Spotlight introduced? While we all wait, and enjoy new betas and features for scanning and e-mail archiving, wouldn’t it be possible to produce a spotlight plugin (like ziplight for searching zip files, and so many others) to ease the pain for the end user until the database is restructured?

I love DEVONthink and use it every day, but it would be so much more powerful with Spotlight search capability, which I also use every day.

No, in the case of ziplight the files inside the archive are actually stored in the Finder. That’s much easier to devise a Spotlight plugin for, than for the text-type files stored in the DT database.

Of course, if you use Index-captures of files on your drive, they are also indexed by Spotlight. Even then, Spotlight won’t currently see files that you’ve created yourself in the database, or have captured from the Web.

In my case, I design topical databases that contain all of the “interesting” files that I have. So I use DT Pro’s search and AI features. And searching is, as I noted, orders of magnitude faster than Spotlight searches, often millions of times faster, and the search results appear directly in my working environment. So I use Spotlight searches sometimes to identify files that I might want to move to a DT Pro database, or to look for the locations of preferences, scripts or plugins, especially when I’m uninstalling an application. And I use Spotlight to look for contact, calendar and to-do information, although most of my contact information, with copious notes, e.g. on project activities, is in a DT Pro database.

I’m looking forward to version 2.0 databases that will also allow Spotlight indexing. That will be useful to me for searching across databases (although I usually know which database will contain what I’m looking for). But for me the most important feature of version 2.0 will be greatly enhanced search operators like those already available in DEVONagent 2.0. Add to those features reduced memory requirements and speedup of some operations, which will allow larger but still quick databases. So 2.0 will be a major upgrade.

Personally, I think Spotlight is awful. It starts too quickly, never finds what I want, and doesn’t list files in a useful order. Even the interface, um, stinks. I now skip it and use NotLight, a free utility distributed by Matt Neuberg at tidbits.com/matt/?@336.0dSZbNLPwAd@. It waits for you to finish an input string, offers many searching options, and quickly finds e-mail in an Entourage database.

But I also wonder what Spotlight can do for DT that it does not already do with its fast searching and AI capabilities. Even if you use linked or indexed files, DT will find them, show their paths, and launch them. What am I missing here? I’d like to know.

Spotlight is CORE Mac OS X functionality. Solutions you can make work are often the territory of advanced users and up; USERS, however, use functionality that just works. Sit at any Mac and you are using the same great built-in technology for searching. No need to learn a new system for every keyboard you sit at. Search and search alike on the Mac.

That is a big part of what makes the Mac experience so different. Spotlight just works, and it works quite well. Build smart folders, easy yet complex searches from any finder window, all built right in. No update or tweak necessary. It just works. And this functionality is even now extending to the server and all server contents. Built right in.

Sure, we can all do amazing things on top of this terrific computing platform, but a great program leverages the great built-in CORE of the operating system upon which it relies and resides.

So, do we create complex hoops for users to jump through with solutions cobbled together that we can make work, or do we deliver technology that simply just works? It’s a philosophy and a state of mind.

Yes, Spotlight is a neat addition to OS X. But I hope it gets better. It’s lousy for simply finding files. For that, I prefer the free EasyFind application on DEVONtechnologies’ Downloads site.

As to making Spotlight the core search engine of DT Pro, no way! That would seriously dumb down DT Pro. Compare the working environment in which Spotlight presents search results, to the working environment in which DT Pro presents search results. The latter is much richer and more useful – in short, much more mature than Spotlight.

Version 2.0 of the DT applications will integrate with Spotlight in a way that should add value.

But Spotlight, in its current version, does have shortcomings. See for example http://www.macfixit.com/article.php?story=20061109235901299.

Unless I’m mistaken, NotLight is just a front end for the Spotlight technology built into Mac OS X.

I’d say a search flaw exists when you can’t even search outside your current database. Big big flaw, but you pass that off with a casual remark that you just remember what is in each of your databases. Metadata man, the new search superhero :stuck_out_tongue:

One Mac, one Search. Might be a Bob Marley song in there somewhere.

b.3, I’m amused.

Although I started working with mainframe computers back in the 1960s, there were no document management systems even remotely comparable to DT Pro that were available to individuals, for decades thereafter.

Back in those old days I was involved with publishing several books and a large bibliography project funded by the National Science Foundation.

Back then the primary tools were thousands of hand-written index cards, some organizational skills and a good memory. If I needed to pull a reference, I could usually find it pretty quickly from among 5 or 6 shoe boxes containing stacks of cards organized into groups and held together with rubber bands. I had created those groups by shuffling and reshuffling those cards into a coherent organization.

For several years in the 1970s I chaired a task force of the U.S. states that was created to provide technical input to the U.S. EPA during the development of the Federal hazardous waste regulations. During that process one had to be very familiar with hundreds of pages of draft and existing federal regulations. State/EPA meetings were often heated, and it helped if one could not only remember the regulations in detail but be able to cite the page number of a relevant citation. I don’t naturally have a photographic memory, but one had to learn to pull off tricks like that to have any chance of winning arguments. So I spent many hours going through those regulations, over and over again. I made it a point to understand the regulations better than EPA’s own lawyers. And of course one had to be intimately familiar with the chemical, physical and toxicological properties of hazardous wastes and with the available technologies for managing them. No computer assistance; you had to hold that information in your head, ready to pull out in the context of a discussion.

Projects like that required a great deal of sheer drudgery.

That’s why I’m in love with DT Pro. It lets me quickly jump from mere document management to the much more rewarding and useful level of information management and analysis. Believe me, Spotlight cannot do that. (Did you look at the URL I posted earlier in this thread about some of Spotlight’s limitations? If you Google “Spotlight limitations” you will find hundreds, perhaps thousands more. )

I use and recommend topical DT Pro databases, and mine are self-contained, so they are independent of the computer on which they reside. My databases contain in total more than 100,000 documents. If a file is pertinent to one of my interests, it goes into a database.

It doesn’t take much intellectual rigor to categorize databases. My main working database contains currently over 20,000 documents that meet the reference needs for that topic (environmental science and technology and related material). I don’t need to include in that database my financial records, for example, which is a very large database itself (and gives me instant access to bank statements, photocopies of checks, tax records, etc.). Nor do I need to include in my main database a huge and rapidly growing compilation of detailed chemical analytical procedures, sampling procedures, statistical methodologies for environmental data evaluation and so on, that are secondarily related to my main database. This database looks as though it will eventually be two or three times larger than my main database (so I may ultimately split it into environmental media – air, water, soil, tissue testing procedures, for example). The level of detail in this database would reduce the performance of the AI features in my main database. But it’s a great reference data set for, e.g., providing resources for graduate student training. One can compare, for example, approved methodologies used in the U.S. and in the European Union. As it’s self-contained, it can be provided on a DVD disc.

So if the relatively low level of intellectual rigor required to categorize into topical databases the kinds of information that interest me seems impressive, go ahead and call me “Metadata man” – to any tune you like. :slight_smile:

I was thinking Metadata man … to the secret agent man tune.

However, you are sooooo far beyond a typical user, and that’s who we write software programs for, the users. Remember the users. It is so easy to lose touch with those who are sometimes confused by left/right click.

Complexity under the hood, simplicity at the fingertips.

And, yes, I did check out the link you provided. I also Googled for DEVONthink reviews, and almost every review noted the lack of spotlight support. Users in forums bemoan the fact too. I’m glad the support is on the way. Spotlight support will take this already powerful program and make it even more accessible for the users. I can’t wait.

DEVONthink is terrifically powerful and hopefully it will only get better.

Cheers.

I’ll weigh in here. I use DT precisely because Spotlight is such a PITA. I’m not the least bit impressed with it. It doesn’t capture any of my Apple Mail accurately. It flashes the subject line in PowerMail for a nanosecond then changes that to some hexadecimal nonsense so that I don’t know what I’m looking at.

You can’t search within the data.

So many of the search returns are nonsensical and unnecessary; unweighted.

And I find that EasyFind gets stuff that Spotlight misses. Foxtrot is far better for searching in PowerMail, but it uses the Spotlight engine somehow, so I have to leave Spotlight on.

BTW b.3, Bill was a “typical user” on these boards for years before the DT guys wisely hired him.

Bill could probably be accused of a lot :slight_smile: but being a typical user just might not fly. As I’ve lurked here I’ve seen Bill’s frequent and helpful posts. I decided to weigh in on this topic because it speaks to the only real detractor DEVONthink has, and that is usability. So many powerful and advanced features are packed inside, and when usability of the program matches its functionality, watch out.

As for Spotlight, I have no complaints. When I search for e-mail or files on my computer (and this is every day) Spotlight finds what I’m looking for quickly and efficiently. SpotInside is a pretty cool addition that highlights words within docs, but I rarely use it. Additionally, Spotlight ties right in to the metadata associated with and stored in files. Need pics with only a certain aperture or exposure time? How about a smart folder rolling those items up for you automatically? Audio files by bitrate? No problem and no configuration needed.
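To make the metadata point concrete: searches like these are expressed as raw Spotlight queries over attributes such as `kMDItemFNumber`, `kMDItemExposureTimeSeconds` and `kMDItemAudioBitRate`. Here is a small sketch that composes such query strings in Python; the helper names and the particular values are only for illustration, and I am assuming the bit rate is stored in bits per second.

```python
def metadata_query(attribute, operator, value):
    """Format one raw-query comparison, quoting string values."""
    if isinstance(value, str):
        value = '"' + value.replace('"', '\\"') + '"'
    return f"{attribute} {operator} {value}"

def all_of(*clauses):
    """Combine clauses so that every one of them must match."""
    return " && ".join(f"({clause})" for clause in clauses)

# Photos shot at f/2.8 or wider with an exposure of 1/250 s or shorter.
photo_query = all_of(
    metadata_query("kMDItemFNumber", "<=", 2.8),
    metadata_query("kMDItemExposureTimeSeconds", "<=", 0.004),
)

# Audio files at 256 kbit/s or better (unit is an assumption, see above).
audio_query = metadata_query("kMDItemAudioBitRate", ">=", 256000)

print(photo_query)
print(audio_query)
```

The resulting strings can be passed to `mdfind` in Terminal. A smart folder is essentially a saved query of this kind.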

A quick Google shows DEVONtechnologies has sort of been out against Spotlight since the beginning. I can understand this since there is terrific search capability built into DEVONthink, and it was there before Spotlight. However, will Spotlight last? Yes. Will it get more and more powerful? Yes. They are extending Spotlight to the server even now with 10.5. The reviews on Spotlight are overwhelmingly good. And, users on every Mac are using Spotlight every single day.

I’m not attacking Bill, and I’m certainly not attacking DEVONthink; rather, I’m adding my perspective as an avid DEVONthink user and supporter.