Why Ruby Class Methods Resist Refactoring

klochner · on April 2, 2014

Ruby shines because it is a terse/expressive language that doesn't need heavy use of patterns, so you can grok a lot of logic at a glance. You want a hash? Write it inline, you don't need a lazy-loaded memoized function.

The original was the easiest to understand, and the refactored versions didn't expose much that would benefit from testing or code reuse. This feels over-engineered to me.

And not to nitpick, but since when is class << self an ugly form? It makes perfect sense if all methods will be class methods.

ryanto · on April 2, 2014

The biggest problem with "class << self" syntax is that it is hard to read in classes that have more than a handful or so lines. At quick glance you might not be able to tell if "def xxx" is a class method or instance method. However, with "def self.xxx" it is more easily recognizable.

Btw, I think its a small nitpick. Plenty of great code bases use "class << self".

klochner · on April 2, 2014

Totally agree, that's why I said 'if all methods are class methods...', in which case class<<self should go at the top of the class body. If there are instance methods I prefer def self.bar.

charlysisto · on April 2, 2014

I don't think there's any use of a heavy pattern in there, it's just leveraging the basic OO idea of state.

My rule of thumb is to use class methods as factories to initialize your object in a certain way, which is quit consistent with what the class method `new` does. Once you stay in that line of thought the rest flows quit naturally without over-thinking anything.

Dirlewanger · on April 2, 2014

Definitely over-engineered, doubly so if this class has no interaction with any other classes...and it doesn't appear that it would. It seems the class does a single discrete function, so I don't see anything wrong with one easy-to-read method as opposed to needlessly abstracting it out into n parts.

stormbrew · on April 2, 2014

I feel like this misses the real reason the construction presented is awkward. It's not because class methods are hard to refactor, it's because class methods in ruby are not static methods. The thing being attempted seems to be to avoid creating a method that doesn't access state, as if this were C++ and state is relatively expensive.

A class method in ruby is not a static method. It is an instance method on the class. It sounds like nitpicking, but the distinction is real. You call #new on a class because the class is The Thing that makes objects, not an awkward pseudo-global scope. Class methods should thus be things for which the class is the subject.

simon_weber · on April 2, 2014

"The signature of 'this shouldn't be a class' is that it has two methods, one of which is __init__." [0]

Why not write a function? This addresses all of the objections. Then refactor by pulling out concerns into other functions.

Need to share logic around the functions? Use a decorator. Passing too many attributes? Create a class to contain them.

I think classes are best avoided until it's obvious that encapsulating state provides a benefit, which I don't see in this case.

[0] https://www.youtube.com/watch?v=o9pEzgHorH0&t=3m

stormbrew · on April 2, 2014

Comparisons to python are difficult here because in Ruby there's no such thing as an object that doesn't have a class. Even the classes have classes (which is, as I touched on elsewhere, what class methods actually are -- instance methods on the class).

You can store a proc in a variable or a constant, but you're getting into some very ugly patterns at that point and you still haven't really left classes behind.

In Ruby everything is encapsulated and there's really no point in fighting that.

lmm · on April 2, 2014

> Comparisons to python are difficult here because in Ruby there's no such thing as an object that doesn't have a class. Even the classes have classes (which is, as I touched on elsewhere, what class methods actually are -- instance methods on the class

So exactly the same as Python then?

> You can store a proc in a variable or a constant, but you're getting into some very ugly patterns at that point and you still haven't really left classes behind.

How so? If you need to pass around something that quacks like a proc, do it as a proc.

stormbrew · on April 2, 2014

> So exactly the same as Python then?

Recently? Sort of, yes. Though in python it's more like everything has a blessed hash, and the assumption that everything is an object is much more thoroughly baked in in Ruby. Also the way that classes interact with their instances is quite different.

For example, in python a member of the class that is not overridden in an instance also happens to be a member of an instance. This is how default attributes work:

    >>> class x(object):
    ...     pass
    ... 
    >>> x.blah = 1
    >>> x().blah
    1

This actually gives you something qualitatively like a static method (through the staticmethod decorator), it is callable from an instance as well as the class and behaves always as if it has no access to the instance's state.

This kind of blending of instance and class doesn't happen in Ruby. The class is a completely distinct object with its own independent state, and is instantiated just like any other object, some syntactic sugar aside. It is an instance of class Class and its methods are the methods of class Class, while instances created by calling its #alloc method have the methods that are defined on it.

Python now shares some concept of everything being an object, I'll grant, but their models remain quite different and you really do have to work significantly against the grain to effectively use ruby in a non-OO fashion. In ways you simply don't in Python.

programminggeek · on April 2, 2014

I actually like class methods for one particular benefit - namespacing. Actually, in Ruby, I use modules for that, but same difference in Ruby right...?

Functionally though, there isn't much difference between Module.do_stuff() and module_do_stuff() and if you are writing functional code, there isn't much reason to care either way too much. Most Ruby I see though isn't terribly functional.

programminggeek · on April 2, 2014

One of the biggest problems in ruby is that there is often no clear difference in code between a method and a variable. This leads to some seriously bad code by developers for the sake of elegance.

For example, you can have a Model.foo method that seems harmless enough, except that actually calls other methods on other models to do some calculation that maybe two or three layers deep makes a call to an external service. Let's say for the sake of example that two or three levels of database queries and api service calls ends up spawning hundreds of queries and at least dozens of external api calls. This takes anywhere from 10 to 30 seconds if the dataset is large enough.

A junior programmer would think he is doing the right thing by simply calling Model.foo because he might be so foolish to not know the difference between Model.foo and Model.foo(). Since Rails by default doesn't annotate database model attributes, it's even easier to think that Model.foo Model.bar Model.whatever are just attributes so there would be no performance penalty for using them.

As it turns out, this makes your code terribly slow even though it looks "elegant" and "clean". Why? Because it is super non-obvious to many/most programmers that there is a potentially huge difference between a simple attribute on a model vs. a calculated attribute method derived from potentially a multitude of other data sources. When you are calling Model.foo and Model.bar, it is completely unclear that one might be a method and one might be an attribute.

I've seen this happen many times to both really good and really average Ruby programmers. At this point it is a genuine flaw in Ruby that is compounded by the way I see people using ActiveRecord and Rails.

ryanto · on April 2, 2014

I think it is unfair to single Ruby out here and call it a "flaw". What you are describing is just bad code/design that can exist in any language.

Also, as far as Ruby is concerned your class's external API is only made up of methods. There are no attributes, model.foo is always a method. Because of this your long running or resource intensive methods need to be documented or have clear names if you are exposing them as public API. In other words: Model.foo shouldn't take 30seconds, but maybe Model.download_foo will.

rtpg · on April 2, 2014

>What you are describing is just bad code/design that can exist in any language.

Actually the problem he is describing is a consequence of language design, not of architecture design.

In Javascript, if I write foo.bar, I know that it is accessing some field of the foo object, and that's it. Unfortunately we're going to get a bunch of stuff in ES6 that's going to change this, but the fact that we can have some invariants (especially in a dynamic language like Ruby) on property accessors is pretty important when trying to reason about performance.

The fact that Ruby only exposes methods is a language design decision that can enable some pretty bad architectural decisions.

Language features are about tradeoffs .For example, among other things, the calling syntax for ruby means that point-free functional programming is pretty much impossible (though there are possible language-level solutions for this). Depending on what you're looking for, that can be considered a flaw , because it deviates from your personal definition of the ideal programming environment.

btown · on April 2, 2014

It's interesting that you mention Javascript. There was recently a post on CoffeeScript that hit the frontpage, and one reference it linked was to a point of contention between CoffeeScript's developers and a library creator over whether CoffeeScript should introduce a nontrivial amount of wrapper code to support attribute setters and getters [1]. The library author made the following argument:

    For CoffeeKit specifically, the getters for all accessor 
    properties are totally without side effects. They are getters 
    because they have to wrap access through objective-c selectors. 
    I can't make them fields. I also can't in good conscience 
    make them all functions, requiring people to write:
    
    uikit.UIScreen.getMainScreen().getBounds()
    
    instead of:
    
    uikit.UIScreen.mainScreen.bounds
    
    yuck.

This, however, speaks directly to the point at hand: one could argue that a library creator in good conscience should make them all functions, since it would force library users to "reason about performance." It's tempting to remove extra syntax whenever possible, but sometimes verbosity can reduce technical debt insofar as it encourages developers to think about performance.

Ruby on Rails definitely leans towards reducing verbosity, but it has such a good ecosystem that even though I'd like a platform that makes things a bit more apparent about what things have/don't have side effects, it's still the best option for many projects that need to be created fast and maintained into the future. I'm looking forward to ES6 generators in Node.js, since they promise to create an ecosystem that avoids "callback hell" but still make it explicit where, in each level of code, expensive operations might occur [2].

[1] https://blog.toshokelectric.com/why-im-ditching-coffeescript... [2] http://zef.me/6096/callback-free-harmonious-node-js

jordibunster · on April 2, 2014

I don't get it. They're both methods. There's no such thing as "attributes" in Ruby.

programminggeek · on April 2, 2014

Sorry, perhaps I wasn't clear enough. On an ActiveRecord model you will have attributes that are simple values in a table and you can have methods on an object that do stuff. They might both be methods, but the difference between getting a simple attribute and doing a calculation can be incredibly huge once you have models related to other models and calculations that span between them.

In my experience ActiveRecord doesn't handle that particularly well and programmers don't readily know/see the difference because they all look the same, especially in a view.

jshen · on April 2, 2014

This has nothing to do with ruby. In any language you can have a method that returns an attribute or a method that calculates something.

dragonwriter · on April 2, 2014

> One of the biggest problems in ruby is that there is often no clear difference in code between a method and a variable.

The only even superficial ambiguity is between calls to methods on the current object and local variable references.

> For example, you can have a Model.foo method that seems harmless enough, except that actually calls other methods on other models to do some calculation that maybe two or three layers deep makes a call to an external service.

That's not a lack of clear difference in code between a method and a variable, since when its referenced in external code its unmistakably a method, not a variable reference (its quite awkward to reference an object's variables from outside in Ruby). Its that Ruby is doesn't have syntactic conventions which imply things about the implementation of methods (the way, e.g., C# has with properties, which are simply a gate way to call methods, though there are fairly strong -- though not universally observed, and therefore dangerous to rely on -- conventions about what the underlying getter and setter methods should and should not do.)

Personally, I think Ruby's way is the right way if you don't have strong and language (e.g., compiler) enforced guarantees.

Argorak · on April 2, 2014

Hm, would you also agree that it is a flaw of Haskell that there is no difference between a value and function with no arguments in Haskell?

The syntactical equality between a variable and a method call on self has the advantage to make things like this possible. E.g. if i have:

    foo = DataBase.lazy_load(:foo)
    do_something_with(foo)

can easily be refactored to

    def foo
      DataBase.lazy_load(:foo)
    end

    do_something_with(foo)

I think that to be a strong-point of the Ruby language, speaking about values and not about the technicalities. Also, every external access to a value has to happen through a Method.

Also, you are complaining about the Rails API more than anything else. E.g. in DataMapper, it was quite obvious that those attributes do hard work (e.g. coercion), because you had to explicitly define them in the first place.

rtpg · on April 2, 2014

There's no such thing as a function with no arguments in Haskell.

f :: () -> T is a function that still takes a (meaningless) value

g = map f list is just a value.

kazagistar · on April 2, 2014

An (IO a), or even more so, a (State a) or (Reader a) very often seems exactly like a function with no arguments in any imperative language.

Argorak · on April 2, 2014

Thats what I meant, everything is a function and can be expressed as one, even most syntax (the literal 1 => Num a -> a)

hayksaakian · on April 2, 2014

Its all about abstraction.

If you're calling Model.foo or Model.bar, you don't care about the specific implementation, you just want foo or bar.

The calling code doesn't need to know how Model works.

This way you can refactor either ones into normal attributes down the road without changing any dependant code.

programminggeek · on April 2, 2014

Perhaps, but the simple habit of Model.foo() (for methods) and Model.bar for data would make it easier to spot when one is an attribute and one is a method.

phamilton · on April 2, 2014

That seems to fit the definition of leaky abstraction (an implementation detail influencing whether how you use it). Leaky abstractions are all over.

That same logic would suggest that we should name our methods Model.foo_slow() and Model.bar_fast() because the execution speed is important and should be transparent to the caller.

In practice, Model.foo() ends up often being memoized so that all but the first call are practically equivalent to accessing an attribute.

ryanto · on April 2, 2014

This won't work because it is not refactor proof. What happens when your system changes and Model.bar needs to become a method that does computation.

Your suggestion violates the uniform access principle: http://en.wikipedia.org/wiki/Uniform_access_principle

dperfect · on April 2, 2014

Any performance issues (which you brought up before the edit) should be entirely the concern of the class whose method is causing the issues. Anything on the outside shouldn't have to rely on syntactic clues to discover when something may be slower.

If I care to call person.eye_color, I don't care (as the caller) how Person implements eye_color (it may be a simple attribute or do some ridiculously slow color calculation) - if I need the color, I need the color. Now, that's not to say that performance doesn't matter, just that any performance concerns should be addressed in Person, not in trying discourage caller from calling eye_color and trying to get at that value some other way.

yaur · on April 2, 2014

RubyMine at least will complain if you put the() after a method that doesn't need it. Coming from C# that lack of differentiation kind of irks me.

joevandyk · on April 2, 2014

You do care about the specific implementation for Model.foo if it could take 30 seconds to return a result.

krosaen · on April 2, 2014

In some ways using an object to share parameters between helpers is an alternative to having a function that can have internal helper methods that all have access to the original parameters via the shared enclosed environment. But that doesn't make this "object oriented" and I don't see why that should be a goal in and of itself.

kyllo · on April 2, 2014

A class is just a reusable definition of a data structure. The SyncToAnalyticsService example is not really a data structure at all, it's just a function. It's a verb, not a noun. But it's made into a class just because OOP and patterns. This feels like someone tried to write Java in Ruby.

ollysb · on April 2, 2014

Why not just add a new class which is instantiated and called inside the perform method? There are valid reasons for not choosing class methods but claiming it limits the way in which you implement a method is just nonsense.